1. Introduction
In-situ observations from robotic exploration missions, such as Hayabusa [1], Hayabusa2 [2], Rosetta [3], and OSIRIS-REx (Origins, Spectral Interpretation, Resource Identification, Security, Regolith Explorer) [4], have significantly expanded our understanding of the solar system, and more deep space exploration missions are planned [5,6,7,8]. In these missions, recent technological advances make it relatively easy to acquire a large amount of scientific data. However, the long distances between spacecraft and the ground, coupled with transmission power limited by factors such as antenna size and power consumption, often result in very low downlink rates. Smart data-handling strategies that work within these limited downlink rates are therefore needed to maximize the scientific return of future deep space exploration missions.
The limited downlink rate can be one of the biggest issues for the Japan Aerospace Exploration Agency (JAXA)’s Martian Moons eXploration (MMX) mission, which is scheduled to be launched in 2024 [9]. This mission will explore the two moons of Mars, Phobos and Deimos, to determine their origins and evolution. There are several theories on how the two moons formed. Their basic properties, such as small sizes, low bulk densities, irregular shapes, and low albedos, together with their visible to near-infrared reflectance spectra, support the captured-asteroid origin theory, in which primitive carbonaceous chondrites originating outside the Martian system were captured by Mars’s gravity [10,11,12]. On the other hand, their orbital parameters, including low orbital inclinations and eccentricities, and some numerical studies support the giant-impact origin theory, in which ejecta accreted to form the moons after a giant impact on Mars [13,14,15,16,17,18,19].
To settle this debate about their origins, the MMX mission plans to land on Phobos (ideally twice, at different sites) and collect surface materials (
), which will be returned to Earth in 2029 and analyzed in laboratories [9]. Previous missions such as Hayabusa [1], Hayabusa2 [2], and OSIRIS-REx [4] performed touchdowns (contacts with the surface lasting less than a second) and collected samples from the surfaces of asteroids. Landing and sampling in the MMX mission are more challenging because the spacecraft must land safely and stay for approximately 2.5 hours on Phobos [20], where knowledge of the surface is limited. Moreover, Phobos is expected to have many topographic irregularities, including slopes [21], craters [22,23,24,25,26,27], rock particles (i.e., boulders, cobbles, and pebbles) [28], and grooves [23,24,28,29,30,31]. The gravity of Phobos (~1/2000 G [32]) is also higher than that of any of the small asteroids on which touchdowns were previously performed (
G [33,34,35,36,37]), so landing site selection could be one of the most crucial phases of this mission.
Based on the current specifications of the MMX spacecraft, we need to find flat regions with altitude differences smaller than 40 cm [38]. Although previous missions, such as Mariner-9 [39], Viking-Orbiter [40], Mars Global Surveyor [28], Mars Express [41], Mars Reconnaissance Orbiter [42], and the Emirates Mars Mission [43], have observed Phobos and Deimos, the resolutions of the resulting optical images and three-dimensional shape models are insufficient for this stringent landing-site requirement. While some numerical studies have indicated that sufficiently smooth regions for landing can be found on Phobos [38], if the safety of the spacecraft during landing is the top priority, the most practical strategy is to perform in-situ observations after arrival and identify safe landing regions from them.
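The 40 cm flatness requirement can be illustrated with a minimal sketch. It assumes a local DTM is available as a regular grid of heights in meters and treats a patch as "flat" when its highest and lowest cells differ by less than 0.40 m. The function names, the grid representation, and the square window are assumptions made for illustration, not the procedure planned for MMX operations.

```python
def is_flat(patch, max_relief_m=0.40):
    """Return True if the height range inside the patch is below the limit."""
    values = [h for row in patch for h in row]
    return max(values) - min(values) < max_relief_m


def find_flat_windows(dtm, window, max_relief_m=0.40):
    """List the (row, col) top-left corners of all flat window x window patches.

    dtm is a list of equally long rows of heights in meters (hypothetical layout).
    """
    hits = []
    for r in range(len(dtm) - window + 1):
        for c in range(len(dtm[0]) - window + 1):
            patch = [row[c:c + window] for row in dtm[r:r + window]]
            if is_flat(patch, max_relief_m):
                hits.append((r, c))
    return hits


# Toy 3 x 3 height grid (meters) with a raised feature in the right-hand column.
toy_dtm = [
    [0.0, 0.1, 0.9],
    [0.1, 0.2, 0.8],
    [0.0, 0.3, 0.2],
]
flat_corners = find_flat_windows(toy_dtm, window=2)
print(flat_corners)
```

In this toy grid only the two patches in the left-hand columns pass the check; any altitude error introduced by image compression would directly shift which patches pass, which is why the DTM accuracy matters for site selection.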
The MMX spacecraft is thus equipped with two optical cameras: the Telescopic Nadir Imager for GeOmOrphology (TENGOO) and the Optical RadiOmeter Composed of CHromatic Imagers (OROCHI) [44]. The high-resolution imaging capabilities of these cameras are critical for landing-site selection. To map Phobos’s surface with a resolution of ~12 cm/pix from a quasi-satellite orbit (QSO) around the moon, the cameras have been designed to outperform the optical cameras on previous sample-return missions, such as AMICA of Hayabusa [45], ONC-T of Hayabusa2 [46], and OCAMS of OSIRIS-REx [47]. Their 3296×2472 pixel image size and 16-bit depth are significantly advantageous for observing Phobos’s surface in fine detail. However, this increased resolution substantially increases the image file size, with raw images reaching about 16.3 MB. Transmitting one such image to the ground could require over 1.1 hours with the MMX’s typical telecommunication system at ~32 kbps. This downlink duration is impractical given the limited observation period for landing-site selection, which is scheduled to be completed within approximately half a year and is severely constrained by illumination conditions determined by the relative positions of the spacecraft, Phobos, Mars, and the Sun [20]. Additionally, the precise sample collection locations must be determined within the brief 2.5-hour landing window, leaving under 1 hour for downlinks.
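The file size and downlink time quoted above follow from simple arithmetic, reproduced in the sketch below. It assumes 1 kbps = 1000 bit/s, 1 MB = 10^6 bytes, and no header or protocol overhead; the 70% size reduction is only an illustrative value, and the function names are hypothetical.

```python
WIDTH_PX = 3296
HEIGHT_PX = 2472
BITS_PER_PIXEL = 16
DOWNLINK_BPS = 32_000  # ~32 kbps, assuming 1 kbps = 1000 bit/s


def raw_image_bytes(width=WIDTH_PX, height=HEIGHT_PX, bits=BITS_PER_PIXEL):
    """Size of one uncompressed frame in bytes, ignoring any file header."""
    return width * height * bits // 8


def downlink_hours(size_bytes, rate_bps=DOWNLINK_BPS):
    """Hours needed to transmit size_bytes at rate_bps, ignoring overhead."""
    return size_bytes * 8 / rate_bps / 3600


def reduced_bytes(size_bytes, reduction_percent):
    """File size after removing reduction_percent of the original volume."""
    return size_bytes * (1 - reduction_percent / 100)


raw = raw_image_bytes()
print(f"raw image size : {raw / 1e6:.1f} MB")
print(f"raw downlink   : {downlink_hours(raw):.2f} h")
print(f"70% smaller    : {downlink_hours(reduced_bytes(raw, 70)):.2f} h")
```

Under these assumptions a single raw frame already exceeds the downlink time available during the landing window, whereas a frame reduced by 70% fits in roughly a third of an hour.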
One reasonable approach to transmitting such image data is data compression; numerous lossy compression algorithms have been developed that reduce image file sizes significantly. However, highly compressed images can become distorted and scientifically unusable. Optimizing image data handling so that data volume is reduced without compromising scientific value is therefore important for future deep space exploration missions, including the MMX mission. Previous studies have analyzed the influence of data compression on images from exploration missions [48,49]. To address the requirements of the MMX landing operations, we focus on the effect of image compression on the accuracy of local digital terrain model (DTM) generation and examine whether such compression loses information critical for selecting safe landing sites on Phobos (i.e., ~40 cm altitude differences).
2. Materials and Methods
2.1. Experimental setup
Although previous studies showed the existence of various geological features on Phobos, including craters, grooves, and boulders [23,24,28], the spatial resolutions of the images obtained by previous exploration missions are limited (no finer than several meters per pixel), and the details of the surface roughness are still unknown [38]. This knowledge gap will remain until the MMX spacecraft arrives at Phobos and reveals detailed surface structures through high-resolution images. Accordingly, we prepared simulated images of Phobos with both rough and smooth surfaces to evaluate the influence of image compression on landing-site selection under various surface conditions.
For the simulated images of a rough surface, we use a simulator of OROCHI, an optical chromatic imager onboard the MMX spacecraft [44]. This OROCHI simulator has seven wide-angle band-pass imagers with a pixel pitch of 5.86 μm and a focal length of 12.5 mm, giving an iFoV of 0.469 mrad/pix, which is similar to that of OROCHI (0.44-0.46 mrad/pix [44]). The seven band-pass filters have wavelengths of 390 nm, 480 nm, 550 nm, 650 nm, 700 nm, 800 nm, and 950 nm, respectively (Figure 1a). These imagers are mounted on an aluminum frame and can operate simultaneously, allowing the camera to capture the same region of the Phobos surface in seven different colors, which is useful for determining suitable landing sites and for identifying whether surface materials are uniformly distributed. In this study, we mainly focus on images obtained by the imager with the 550 nm band-pass filter, a wavelength commonly used in previous exploration missions, including Hayabusa and Hayabusa2 [45,46]. For the surface material, we used the University of Tokyo Phobos Simulant, Tagish Lake Based (UTPS-TB) [50]. This simulated Phobos regolith is produced from terrestrial soils and materials and exhibits a reflectance spectrum similar to that of Phobos surface materials, which allows us to acquire images that closely resemble those anticipated from the MMX mission.
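The iFoV quoted for the OROCHI simulator can be checked with the small-angle relation iFoV ≈ pixel pitch / focal length. The sketch below assumes the pixel pitch is given in micrometers and the focal length in millimeters; the function names and the 1 m example distance are illustrative assumptions.

```python
def ifov_mrad(pixel_pitch_um, focal_length_mm):
    """Instantaneous field of view per pixel in milliradians.

    Micrometers divided by millimeters gives 1e-3 rad, i.e. milliradians,
    so no extra scale factor is needed (small-angle approximation).
    """
    return pixel_pitch_um / focal_length_mm


def ground_sample_mm(ifov_mrad_value, distance_mm):
    """Approximate footprint of one pixel on a surface viewed at nadir."""
    return ifov_mrad_value * 1e-3 * distance_mm


simulator_ifov = ifov_mrad(pixel_pitch_um=5.86, focal_length_mm=12.5)
print(f"simulator iFoV: {simulator_ifov:.3f} mrad/pix")
print(f"pixel footprint at 1 m: {ground_sample_mm(simulator_ifov, 1000):.3f} mm")
```

The same relation, scaled to the observation distance, links the iFoV of a camera to the spatial resolution it achieves on the surface.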
The images were obtained with the following experimental setup (Figure 1b). UTPS-TB was spread over a foundation with a size of approximately
cm. The OROCHI simulator was fixed to a frame to prevent image blurring due to camera shake. This frame can be moved to capture images from different regions using the same imager. Images were taken from a height of approximately 70 cm directly above the foundation. A halogen lamp, whose wavelength domain is not covered by the seven band-pass filters, served as the light source, with an incidence angle (phase angle) of
, an optimal angle for observing surface rocks. Each image has a size of
pixels. Note that TENGOO and OROCHI use a 14-bit A/D converter, while data are stored with a bit depth of 16 bits. To simulate this data structure, we converted the 8-bit color images from the OROCHI simulator to 16-bit grayscale images, ensuring that each pixel value fell within the 14-bit range. In addition, the OROCHI simulator images contained a dark surrounding frame caused by the band-pass filters, which we excluded in post-processing, as explained in Section 2.3.
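One way to perform such a conversion is sketched below. The text does not state how the conversion was carried out, so the Rec. 601 luma weights and the linear rescaling from the 8-bit to the 14-bit range are assumptions, and the function names are hypothetical. The resulting values fit in a 16-bit container while never exceeding the 14-bit maximum of 16383.

```python
MAX_8BIT = 255
MAX_14BIT = (1 << 14) - 1  # 16383, largest value a 14-bit A/D converter can output


def rgb8_to_gray14(r, g, b):
    """Map one 8-bit RGB pixel to a grayscale value within the 14-bit range."""
    # Assumed grayscale conversion: Rec. 601 luma weights.
    gray8 = 0.299 * r + 0.587 * g + 0.114 * b
    # Assumed rescaling: linear stretch from [0, 255] to [0, 16383].
    gray14 = round(gray8 * MAX_14BIT / MAX_8BIT)
    return min(max(gray14, 0), MAX_14BIT)


def convert_image(rgb_rows):
    """Convert a nested list of (r, g, b) tuples to 14-bit-range gray values."""
    return [[rgb8_to_gray14(*pixel) for pixel in row] for row in rgb_rows]


toy_image = [
    [(0, 0, 0), (255, 255, 255)],
    [(128, 128, 128), (255, 0, 0)],
]
gray = convert_image(toy_image)
print(gray)
```

Stored as unsigned 16-bit integers, such values leave the two most significant bits unused, mirroring a 14-bit sensor output written to 16-bit files.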
With a similar procedure, we obtained simulated images of Phobos with a smooth surface (Figure 1c). In this experiment, we used a commercially available digital camera to match the image size of OROCHI and TENGOO (3296×2472 pixels). We used a foundation with a size of
cm having circular depressions, which simulate craters on Phobos. We also spread UTPS-TB over the foundation as the surface material. Images were obtained from a height of approximately 188 cm above the foundation. As with the rough-surface images, we converted the 16-bit color images to 16-bit grayscale images to match the data format of TENGOO and OROCHI.
2.2. Image compression and image quality assessment
Image compression schemes fall into two classes: lossless and lossy. In lossless compression, the original image data can be fully reproduced, but the compression ratio is relatively low (i.e., the resulting product has a high data volume). Conversely, in lossy compression, the original image data cannot be fully recovered, but the compression ratio can be very high (i.e., the resulting product has a low data volume). Given the limited downlink rate of deep-space exploration missions, including the MMX mission, a low data volume is important, and thus we mainly focus on image-handling strategies using lossy compression.
The image compression algorithm CCSDS 120.0-B-1 (hereafter simply referred to as CCSDS-120) is designed for use on spacecraft and is a candidate algorithm for the MMX mission. Because the complexity of this algorithm is low, images can be processed quickly even on spacecraft onboard hardware. CCSDS-120 uses the discrete wavelet transform (DWT) in its compression scheme and supports two types of DWT: integer-point DWT and float-point DWT. The integer-point DWT requires only integer arithmetic, which lowers implementation complexity, and supports lossless compression, although it can also be used for lossy compression. On the other hand, the float-point DWT provides improved compression effectiveness at low bit rates but requires floating-point calculations and cannot provide lossless compression. This study examines both the integer-point and float-point variants to determine the best compression scheme.
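The difference between a reversible integer transform and a lossy one can be illustrated with a toy example. The sketch below uses the integer S-transform (a reversible Haar-like transform) on one row of pixel values as a stand-in; it is not the wavelet filter used by CCSDS-120, and the quantization step is an arbitrary illustrative choice. It shows that integer arithmetic allows exact reconstruction, while discarding precision in the detail coefficients makes the result lossy.

```python
def forward_s_transform(samples):
    """One level of the integer S-transform on an even-length sequence."""
    averages, details = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        averages.append((a + b) // 2)  # low-pass: floored pairwise mean
        details.append(a - b)          # high-pass: pairwise difference
    return averages, details


def inverse_s_transform(averages, details):
    """Exactly invert forward_s_transform using integer arithmetic only."""
    samples = []
    for s, d in zip(averages, details):
        a = s + (d + 1) // 2
        samples.extend([a, a - d])
    return samples


def quantize(details, step):
    """Round detail coefficients to multiples of step (this loses information)."""
    return [step * round(d / step) for d in details]


row = [1200, 1204, 1190, 1310, 4000, 3996, 15, 16383]
avg, det = forward_s_transform(row)

lossless = inverse_s_transform(avg, det)
lossy = inverse_s_transform(avg, quantize(det, step=8))

print("lossless round trip exact:", lossless == row)
print("lossy round trip exact   :", lossy == row)
```

Small details (the pairs differing by only a few counts) are the first to be flattened by quantization, which is consistent with compression artifacts appearing first in fine, high-frequency structures.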
In practice, we used a public script published by the University of Nebraska (http://hyperspectral.unl.edu/download.htm) for image compression using CCSDS-120. Note that the compression ratio $C$ in this study is defined as follows:
$$C = \left(1 - \frac{S_{\mathrm{comp}}}{S_{\mathrm{orig}}}\right) \times 100\ [\%]$$
where $S_{\mathrm{comp}}$ and $S_{\mathrm{orig}}$ are the file sizes of compressed and original images, respectively.
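As a minimal sketch, and assuming the compression ratio is the percentage reduction in file size relative to the original (so that a higher ratio means a smaller file, consistent with how the ratio is used in this study), the definition can be written as follows. The function names are hypothetical.

```python
def compression_ratio_percent(compressed_size, original_size):
    """Percentage by which the file size is reduced relative to the original."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return (1 - compressed_size / original_size) * 100


def size_at_ratio(original_size, ratio_percent):
    """File size that corresponds to a given compression ratio."""
    return original_size * (1 - ratio_percent / 100)


original = 16_295_424  # bytes, one raw 3296 x 2472 x 16-bit frame
print(f"{compression_ratio_percent(original * 0.3, original):.0f}%")
print(f"{size_at_ratio(original, 98) / 1e6:.2f} MB at a 98% ratio")
```

With this convention, a 70% ratio leaves 30% of the original data volume and a 98% ratio leaves 2%.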
As the evaluation metric for the influence of image compression on image quality, we use Structural SIMilarity (SSIM), one of the most widely used image quality indices [51]. SSIM evaluates image quality based on three comparisons: luminance, contrast, and structure. SSIM values typically range from 0.0 to 1.0, where a higher value indicates better image quality. A value of 1.0 means that a given image is identical to the original image.
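The SSIM formula can be sketched in a few lines. The version below is a simplification: it evaluates the index once over the whole image (a single global window), whereas SSIM is normally computed over local windows and averaged. The stabilizing constants use the commonly quoted defaults (k1 = 0.01, k2 = 0.03), and the 14-bit dynamic range is an assumption matching the image format described earlier.

```python
def global_ssim(x, y, dynamic_range=16383, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length flat lists of pixel values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) * (v - mean_x) for v in x) / (n - 1)
    var_y = sum((v - mean_y) * (v - mean_y) for v in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [1200, 1204, 1190, 1310, 4000, 3996, 15, 16383]
flattened = [1202, 1202, 1250, 1250, 3998, 3998, 8199, 8199]  # detail removed

same = global_ssim(reference, reference)
degraded = global_ssim(reference, flattened)
print(f"identical images: {same:.3f}")
print(f"degraded image  : {degraded:.3f}")
```

An identical image scores 1.0, and the score drops as structure is removed, which is the behavior used here to quantify compression-induced degradation.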
2.3. Generation of local DTMs
Structure from Motion (SfM) is one of the most widely used algorithms for generating three-dimensional shape models from images obtained by, for instance, satellites, drones, and spacecraft. Planetary science has used SfM alongside several conventional techniques, such as aerial photogrammetric approaches based on stereo pairs [52] and small digital topography/albedo maps (L-map) determined from multiple images with stereophotoclinometry (SPC) [53]. A recent study showed that the accuracy of SfM is comparable to that of SPC [54]. In the MMX mission, SfM is considered a reasonable algorithm for generating local Digital Terrain Models (DTMs). Thus, we use SfM to generate local DTMs.
In practice, we used Metashape Pro (ver. 1.6.4), a commercial software program for generating three-dimensional shape models. To reduce calculation times, we used a high-end CPU and GPUs (a Core-i9 12900K CPU and dual GeForce RTX 3090 GPUs). We then explored optimal parameters for obtaining high-precision local DTMs, as shown in Table 1. Local DTMs were created with Metashape through the following steps:
1. Import images and image masks (i.e., binary images determining the processing areas) and set camera parameters (e.g., focal length and sensor pixel sizes).
2. Georeference cameras by importing camera coordinate files, which define positions (coordinates) and orientations (directions) for each camera.
3. Perform key-point matching by identifying distinctive features in images that can be recognized in other images and matching the most prominent features across the image dataset.
4. Perform bundle adjustment for three-dimensional geometry reconstruction using the network of matched features, incrementally adding images to update camera model parameters (e.g., focal length, radial distortion parameters) and camera orientations (i.e., positions, directions), and calculating three-dimensional coordinates for key points.
5. Generate a sparse point cloud representing the three-dimensional coordinates of the most prominent features in the image dataset, realign images with large coordinate errors, and remove outliers by observing the point cloud from various directions.
6. Build a dense point cloud by calculating depth and color information for each camera.
7. Generate polygon meshes from the dense point cloud that express detailed topography of the target shape.
8. Generate image masks by selecting areas on the mesh with high confidence, created from many points in the dense point cloud. Using those image masks, regenerate the model.
9. Generate the model’s texture by combining the original images seamlessly with the reconstructed polygon meshes.
Note that, to generate high-quality local DTMs, the amount of overlap between each image and the other images is important, because a larger overlapping area yields more reconstructed points and therefore better local DTMs. Thus, we first created models without any image masks (up to Step 7). Then, we selected an area on each model where a large number of points were reconstructed and generated image masks. Using these masks, we processed only the portions of each image with large overlaps with other images, which improved the quality of the local DTMs (Step 8).
Author Contributions
Conceptualization, Y.S., and H.M.; methodology, Y.S., H.M., and S.K.; software, Y.S.; validation, Y.S., and H.M.; formal analysis, Y.S.; investigation, Y.S., H.M., and S.K.; resources, H.M., and S.K.; data curation, Y.S., H.M., and S.K.; writing—original draft preparation, Y.S., and H.M.; writing—review and editing, Y.S., H.M., and S.K.; visualization, Y.S.; supervision, H.M.; project administration, H.M. All authors have read and agreed to the published version of the manuscript.
Figure 1.
OROCHI simulator and experimental setups. (a) OROCHI simulator. Each number represents an imager with the corresponding band-pass filters: 1: 390 nm, 2: 480 nm, 3: 550 nm, 4: 650 nm, 5: 700 nm, 6: 800 nm, 7: 950 nm. (b) Experimental setup for simulated Phobos images with a rough surface. The OROCHI simulator model is fixed to a movable frame, enabling the capture of images of the same region from different camera positions. (c) Experimental setup for simulated Phobos images with a smooth surface. For a simulated surface, we use a foundation with circular depressions, resembling the actual cratered Phobos surface.
Figure 2.
Examples of the simulated images for the rough surface. The image size is pixels and the color space is grayscale. The spatial resolution is 0.379 mm/pix. Note that dark regions surrounded by white lines represent image masks, which show ignored regions while generating the local DTMs. A total of 10 simulated images were prepared for the rough surface.
Figure 3.
Examples of the simulated images for the smooth surface. The image size is pixels, and they are expressed in grayscale. The average spatial resolution is 0.112 mm/pix. Dark regions represent image masks, which show ignored regions while generating the local DTMs. A total of 44 simulated images were prepared for the smooth surface.
Figure 4.
Local DTMs of the rough and smooth surfaces (a) Local DTM for the rough surface. The model is composed of 3,826,280 points and 2,309,668 facets. (b) Local DTM for the smooth surface. The model has 13,265,901 points, resulting in 10,012,252 facets. Those two models express simulated surfaces in a three-dimensional space with a substantially high resolution.
Figure 5.
Image compression effects on the simulated image of the rough surface. (a) Original image. (b) Images compressed with CCSDS-120 with float-point DWT. (c) Images compressed with CCSDS-120 with integer-point DWT. The difference in the distortion level of the two different compression algorithms is minimal. From this simple visual analysis, distortions caused by image compression are not readily apparent until the compression ratio reaches 98%. Those distortions are basically found on small roughness or textures, which correspond to high-frequency structures in image processing using DWT.
Figure 6.
Image compression effects on the simulated image of the smooth surface. (a) Original image. (b) Images compressed with CCSDS-120 with float-point DWT. (c) Images compressed with CCSDS-120 with integer-point DWT. The difference in the distortion level of the two different compression algorithms is limited. Also, from this simple visual analysis, image distortion caused by compression cannot be confirmed clearly until the compression ratio reaches 98%. Such distortions, overall, exist in small roughness or textures, which correspond to high-frequency structures in the image-processing using DWT.
Figure 7.
Image quality assessment using SSIM. The difference in the image quality decrease between CCSDS-120 with float- and integer-point DWT is consistently minimal across all compression ratios. Our quantitative analysis shows that the quality of simulated images for the smooth surface is more susceptible to image compression, even at lower compression ratios, compared to those for the rough surface. This result represents the characteristic of image-processing using DWT, where small roughness or textures (high-frequency structures) are more affected by image compression. Furthermore, we reveal that compression ratios equal to or lower than 70% exhibit limited loss of image quality.
Figure 8.
Compressed models for the rough surface and their error analysis. The error analysis shows the error of compressed models from the original models. (a) Local DTM generated by using compressed images processed by CCSDS-120 with float-point DWT. The model is composed of 3,872,302 dense points and 2,317,660 meshes. (b) Local DTM generated by using compressed images processed by CCSDS-120 with integer-point DWT. The model is composed of 3,873,461 dense points and 2,303,035 meshes. (c) The error analysis of the model shown in (a). The error of this model is mm, equivalent to the error of cm on Phobos. (d) The error analysis of the model shown in (b). The error of this model is mm, equivalent to the error of cm on Phobos. Both analyses show that the compressed models can be generated within 2% error of the number of dense points and meshes from the original model. Further, the altitude errors can be smaller than 40 cm on Phobos, which means that 70% compressed images effectively reduce file sizes without losing important scientific information, especially for determining safe landing sites.
Figure 9.
Compressed models for the smooth surface and their error analysis. The error analysis shows the error of compressed models from the original models. (a) Local DTM generated by using compressed images processed by CCSDS-120 with float-point DWT. The model is composed of 13,074,155 dense points and 9,848,738 meshes. (b) Local DTM generated by using compressed images processed by CCSDS-120 with integer-point DWT. The model is composed of 12,964,085 dense points and 9,902,717 meshes. (c) The error analysis of the model shown in (a). The error of this model is mm, equivalent to the error of cm on Phobos. (d) The error analysis of the model shown in (b). The error of this model is mm, which is equivalent to the error of cm on Phobos. Both analyses show that the compressed models can be generated within 3% errors of the number of dense points and meshes from the original model. Meanwhile, the altitude errors can be smaller than 40 cm on Phobos. As such, 70% compressed images effectively reduce file sizes without losing important scientific information, especially to determine the safe landing sites.
Table 1.
Parameters used to generate local DTMs with high accuracy using Metashape.
Process | Parameter | Setting | Comments |
Align photos | Accuracy | Highest | The program aligns photos with the highest accuracy. |
Align photos | Generic preselection | On | The program makes low-resolution images and finds key points in order to decrease the process time. |
Align photos | Reference preselection | On | The program generates a sparse point cloud by using the camera coordinates information input a priori. |
Align photos | Key point limit | 0 | Key points will be generated without the limitation of the number of points. |
Align photos | Tie point limit | 0 | Tie points will be generated without the limitation of the number of points. |
Align photos | Adaptive camera model fitting | Off | When this parameter is set to be On, the camera parameters for fitting the distortion of the lenses will be determined, which is not necessary in this research. |
Build dense cloud | Accuracy | Ultra high | The dense cloud is generated with the highest accuracy. |
Build dense cloud | Depth filtering | Mild | How aggressively the program filters outliers obtained from the depth computation. “Mild” is recommended. |
Build mesh | Surface type | Arbitrary (3D) | “Arbitrary (3D)” means that the program generates a closed 3D shape model without any holes. |
Build mesh | Source | Depth maps | The program generates the mesh using all the information from the input images including assumed depth maps, which is recommended to use. |
Build mesh | Quality | Ultra high | The mesh is generated with the highest accuracy. |
Build mesh | Face count | 100,000,000 | We set the parameter large enough in order to generate meshes without any limitations of the number of meshes. |