This section discusses two types of mapping primitives: spatial relations and fiducial points. Ontology-based mappings may use spatial relations, whilst image processing-based mappings may use fiducial points. Both types of mapping primitive can be used to determine corresponding anatomical regions across images.

#### Spatial relations as mapping primitives

Spatial relations describe the spatial relationships between spatial entities. The term ‘spatial’ refers to the location in anatomical space occupied by the anatomical entity. The term ‘entity’ refers to an individual anatomical structure such as the liver, heart or kidney. Spatial entities can be either material or immaterial. Material anatomical entities are here understood as anatomical structures with positive mass, such as the liver and brain, whereas immaterial anatomical entities are anatomical structures with no mass, such as the cavity of the stomach [14]. This comparative study aims to identify existing spatial relations used to conceptualise spatial entities in an image. Future research is needed to determine the best set of spatial relations necessary to conceptualise the anatomical space of an image to guide the mapping process.

Spatial entities share spatial relationships, which include topological, directional and metric relations [15, 16]. These relations can be defined by specifying conditions between entities, such as the distance or the relative position. Topological relations describe topological properties such as connectivity, disjointness and containment between spatial regions. Here, spatial regions are assumed to be parts of an independent background space in which all individuals are located. The eight basic topological relations between two spatial regions, according to Egenhofer and Herring [17], are *disjoint, externallyConnected, overlap, contains, equal, coveredBy, inside,* and *covers*.
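As a concrete illustration, the eight relations can be computed for axis-aligned bounding rectangles. This is a simplifying assumption made here for the sketch; real anatomical regions have arbitrary shapes and would need the full 9-intersection test.

```python
# Sketch: classify the Egenhofer-Herring topological relation between two
# axis-aligned rectangles (x1, y1, x2, y2). Rectangles are an illustrative
# simplification of arbitrary spatial regions.

def topo_relation(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # No shared points at all -> disjoint
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disjoint"
    # Boundaries touch but interiors do not intersect -> externallyConnected
    if ax2 == bx1 or bx2 == ax1 or ay2 == by1 or by2 == ay1:
        return "externallyConnected"
    if a == b:
        return "equal"
    a_in_b = bx1 <= ax1 and ax2 <= bx2 and by1 <= ay1 and ay2 <= by2
    b_in_a = ax1 <= bx1 and bx2 <= ax2 and ay1 <= by1 and by2 <= ay2
    if a_in_b:
        # Touching b's boundary -> coveredBy, strictly interior -> inside
        touches = ax1 == bx1 or ax2 == bx2 or ay1 == by1 or ay2 == by2
        return "coveredBy" if touches else "inside"
    if b_in_a:
        touches = bx1 == ax1 or bx2 == ax2 or by1 == ay1 or by2 == ay2
        return "covers" if touches else "contains"
    return "overlap"
```

The pairs *inside*/*contains* and *coveredBy*/*covers* are converses of each other, so `topo_relation(a, b)` and `topo_relation(b, a)` always yield matching converse labels.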

Metric relations describe the quantitative distance between two spatial entities. Distance can be measured, and it specifies how far an entity is from a reference entity. Based on distance, relations expressed by the prepositions near and far, as well as an adjacency relation, can be defined. For example, near can be defined to hold when the spatial regions, suitably enlarged, have a non-empty intersection. Each spatial region’s width can be enlarged by a fraction of its own height, and vice versa. According to Abella and Kender [18], based on human psychology studies, the value of this fraction is approximately 0.6, particularly for long, narrow, parallel entities. The relation far, on the other hand, is not the complement of the relation near [18]. Far can be defined to hold when the distance between the two enlarged spatial regions, in either the *x* or *y* extent, is larger than the maximum dimension of the two spatial regions in that same extent. The adjacency relation can be defined between two material anatomical entities that are close but not connected; more precisely, they are a small but non-zero distance apart [4].
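A minimal sketch of these near/far predicates on axis-aligned bounding boxes follows. The 0.6 fraction comes from Abella and Kender [18]; splitting the enlargement evenly across both sides of each box is an interpretation made here, not something the source fixes.

```python
# Sketch of near/far predicates in the style of Abella and Kender [18],
# on boxes (x1, y1, x2, y2). FRACTION = 0.6 per the source; the symmetric
# split of the enlargement is an assumption of this sketch.

FRACTION = 0.6

def enlarge(box, f=FRACTION):
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    dx, dy = f * h / 2, f * w / 2   # widen by f*height, heighten by f*width
    return (x1 - dx, y1 - dy, x2 + dx, y2 + dy)

def intersects(a, b):
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def near(a, b):
    # Near: the two enlarged regions have a non-empty intersection
    return intersects(enlarge(a), enlarge(b))

def gap(a1, a2, b1, b2):
    """Separation of intervals [a1,a2] and [b1,b2]; 0 if they overlap."""
    return max(0.0, max(a1, b1) - min(a2, b2))

def far(a, b):
    # Far: the gap between the enlarged regions, in the x or y extent,
    # exceeds the maximum dimension of the original regions in that extent
    ea, eb = enlarge(a), enlarge(b)
    gx = gap(ea[0], ea[2], eb[0], eb[2])
    gy = gap(ea[1], ea[3], eb[1], eb[3])
    max_w = max(a[2] - a[0], b[2] - b[0])
    max_h = max(a[3] - a[1], b[3] - b[1])
    return gx > max_w or gy > max_h
```

Note that a pair of boxes can be neither near nor far, which matches the source's point that far is not the complement of near.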

Directional relations are usually described between two spatial entities that do not overlap [19]. These relations can be approximated by comparing the entities’ representative points (also called centroids) or their minimum bounding boxes, and are often described in terms of cardinal directions between two spatial entities [20]. Works by Frank [21], Freksa [22] and Ligozat [23] use the centroids of spatial entities to define directional relations between two entities. Papadias and Sellis [24] represent each spatial entity using two coordinate points corresponding to the lower-left and upper-right corners of the entity’s minimum bounding box. Defining directional relations depends on a frame of reference. A frame of reference can be established by assigning a 2D coordinate system to the centroid of a spatial entity. The x-axis can then be defined as the west-east axis of the entity: the negative region represents the west of the entity while the positive region represents its east. Doing the same with the y-axis to describe the north and south of the entity, it is then possible to determine directional relations for every spatial entity relative to the entity that carries the frame of reference. The frame of reference guarantees that directional relations between two spatial entities remain the same regardless of viewpoint. Topological relations are invariant under continuous transformations, such as translation, rotation or scaling. Directional relations are also invariant under such transformations once a frame of reference is established [16]. The metric distance between two spatial entities changes under scaling but is preserved under translation and rotation. Since spatial relations are largely invariant under continuous transformation, their persistence is fundamental in the process of recognition of anatomical regions in images.
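The centroid-based frame of reference described above can be sketched as follows. The four-way compass split and the function names are illustrative assumptions of this sketch.

```python
# Sketch: place a 2D frame of reference at the centroid of a reference
# entity and classify another entity's centroid into a cardinal direction.

def centroid(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def direction(ref_box, other_box):
    rx, ry = centroid(ref_box)        # origin of the frame of reference
    ox, oy = centroid(other_box)
    # Positive x is east, negative x is west; likewise north/south on y
    ew = "east" if ox > rx else "west" if ox < rx else ""
    ns = "north" if oy > ry else "south" if oy < ry else ""
    return (ns + ew) or "same"        # e.g. "northeast", "east", "same"
```

Because the frame moves with the reference entity, translating both boxes by the same offset leaves the result unchanged, which is the invariance property the text relies on.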

Many existing approaches to image mapping rely on spatial relations between the entities of an image. Spatial entities are identified together with the spatial relationships among them to represent the image. Mechouche *et al.*[25] present a method to describe spatial relations between sulci and gyri of the brain cortical structure using the following terms: *anteriorTo, posteriorTo, superiorTo, inferiorTo, lateralTo* and *medialTo*. Hudelot *et al.*[26] present a method to compute implementations of spatial relation terms such as *right_Of, left_Of, close_to, very_close_to, external boundary* and *internal boundary* to describe cerebral structures of the brain. Du *et al.*[27] present a method which involves topological and directional relations to define some natural-language spatial relations. They propose the following directional natural-language terms: *EP* to denote *east part of a region*, *WP* to denote *west part of a region*, *SP* to denote *south part of a region* and *NP* to denote *north part of a region*. These works demonstrate that the recognition of spatial entities depends on the entities’ spatial relationships in an image.

Chang and Wu [28] propose a technique called the 9DLT matrix, which applies nine directional codes to represent spatial relationships. They define the directional codes as follows: 0 denotes *east*, 1 denotes *northeast*, 2 denotes *north*, 3 denotes *northwest*, 4 denotes *west*, 5 denotes *southwest*, 6 denotes *south*, 7 denotes *southeast*, and 8 denotes *equal*. A single triple *(x, y, r)* denotes a spatial relation between two spatial entities *x* and *y*; for instance, directional code *r*=0 represents that *y* is to the east of *x*. A set of such triples then represents an image, and two images are mapped according to the similarity of their spatial relationships based on the corresponding sets of triples. However, the 9DLT matrix has a significant drawback under rotation. Consider mapping between two images where the first is a 90-degree rotated version of the second: although the two depict the same content, according to the 9DLT matrix they do not match, because their corresponding sets of triples are entirely different after the 90-degree rotation.
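A small sketch of the nine codes, assuming centroids as entity representatives and 45-degree sectors around the compass (the paper's exact construction may differ):

```python
# Sketch of 9DLT directional codes [28]: 0=east, 1=northeast, ...,
# 7=southeast, 8=equal. The 45-degree sector mapping is an assumption.
import math

def dlt_code(x_centroid, y_centroid):
    """Code for how entity y lies relative to entity x."""
    dx = y_centroid[0] - x_centroid[0]
    dy = y_centroid[1] - x_centroid[1]
    if dx == 0 and dy == 0:
        return 8                                 # equal
    angle = math.atan2(dy, dx) % (2 * math.pi)   # 0 = east, counter-clockwise
    # Shift by half a sector so each code is centred on its compass direction
    return int((angle + math.pi / 8) // (math.pi / 4)) % 8
```

The rotation problem is visible directly: rotating an image by 90 degrees turns every east (code 0) into north (code 2), so the triple sets of the original and rotated images share nothing.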

Guru and Punitha [29] propose to address the limitation of the 9DLT matrix by modelling directional relations between two spatial entities using a directed line segment, that is, a line joining two distinct entities. For example, the line joining entity *x* to entity *y* becomes the line of reference, and the corresponding direction from entity *x* to entity *y* becomes the direction of reference for the image. The approach computes the direction of the line joining *x* to *y* using Euclidean distance prior to obtaining the direction of reference. The relative pairwise spatial relationships between each pair of entities are perceived with respect to the direction of the line of reference. To make the system invariant to image transformations, the direction of reference is conceptually aligned with the positive x-axis of the coordinate system. The method proposed by Guru and Punitha [29] successfully overcomes the deficiency of the 9DLT matrix; however, it only covers directional information, which means information on topology is lost.
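The idea can be sketched as follows: fix a line of reference between two entities, then express every pairwise direction relative to that reference direction. How the reference pair is chosen here (the two most distant centroids) is an illustrative assumption of this sketch, not the paper's rule.

```python
# Sketch in the spirit of Guru and Punitha [29]: rotate all pairwise
# directions so the direction of reference lies on the positive x-axis,
# making the representation invariant to image rotation.
import math
from itertools import combinations

def angle(p, q):
    return math.atan2(q[1] - p[1], q[0] - p[0])

def relative_directions(centroids):
    # Reference direction: line joining the two most distant entities
    ref_p, ref_q = max(combinations(centroids, 2),
                       key=lambda pq: math.dist(*pq))
    ref = angle(ref_p, ref_q)
    # Every pairwise direction, measured relative to the reference
    return sorted(round((angle(p, q) - ref) % (2 * math.pi), 6)
                  for p, q in combinations(centroids, 2))
```

Rotating the whole set of centroids shifts every angle, including the reference angle, by the same amount, so the relative directions are unchanged: exactly the invariance that the 9DLT matrix lacks.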

Karouia and Zagrouba [30] propose to represent spatial relationships between two spatial entities of an image using an entity relative positioning vector. The set of these vectors provides information about the disposition of the different entities of the image. The approach defines this disposition based on five vector components: the degree of positioning on the left, on the right, on top, below, and of inclusion. Each of these components expresses a degree of positioning as a numeric value between 0 and 1. This method is intended to represent images containing only isolated entities. Hence, topological information is not required, which is why the approach contains no concept of connectedness among spatial entities.
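A sketch of such a five-component positioning vector is given below, computed here from bounding-box geometry. The exact formulas in [30] differ; this version only illustrates the idea of graded positioning degrees in [0, 1].

```python
# Sketch of a relative positioning vector in the spirit of [30]:
# degrees of left, right, above, below and inclusion of entity a
# relative to entity b, each in [0, 1]. Formulas are illustrative.

INF = float("inf")

def interval_overlap(a1, a2, b1, b2):
    return max(0.0, min(a2, b2) - max(a1, b1))

def positioning_vector(a, b):
    """How entity a (box x1,y1,x2,y2) is positioned relative to entity b."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    w, h = ax2 - ax1, ay2 - ay1
    # Fraction of a's extent lying strictly beyond each side of b
    left  = interval_overlap(ax1, ax2, -INF, bx1) / w
    right = interval_overlap(ax1, ax2, bx2, INF) / w
    below = interval_overlap(ay1, ay2, -INF, by1) / h
    above = interval_overlap(ay1, ay2, by2, INF) / h
    # Fraction of a's area falling inside b
    inter = (interval_overlap(ax1, ax2, bx1, bx2) *
             interval_overlap(ay1, ay2, by1, by2))
    return {"left": left, "right": right, "above": above,
            "below": below, "inclusion": inter / (w * h)}
```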

Zhou *et al.*[31] propose a method called Augmented Orientation Spatial Relationship (AOSR) to describe a range of directions between two spatial entities of an image. Assume that two images *c*1 and *c*2 both contain the same entities *x* and *y*, but the relative distance between these entities differs in the two images. If one simply says that in image *c*1 entity *x* is to the northeast of entity *y* (according to the centroids of *x* and *y*), then there is no way to distinguish this configuration from that of entities *x* and *y* in image *c*2. Therefore, the focus of AOSR is to capture the relative distance between spatial entities prior to describing directional relations between them. Although topological information is also not covered in AOSR, Zhou *et al.*[31] claim that the approach may simply be combined with Egenhofer’s topological representation to cover topological information.
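A sketch in the spirit of AOSR: augment a directional label with a coarse relative-distance band, so that "x northeast of y, close" differs from "x northeast of y, distant". The band boundaries and normalisation by the image diagonal are illustrative assumptions of this sketch.

```python
# Sketch in the spirit of AOSR [31]: direction plus a quantised
# relative-distance band between two centroids.
import math

def aosr(x_centroid, y_centroid, image_diagonal):
    dx = x_centroid[0] - y_centroid[0]
    dy = x_centroid[1] - y_centroid[1]
    ns = "north" if dy > 0 else "south" if dy < 0 else ""
    ew = "east" if dx > 0 else "west" if dx < 0 else ""
    d = math.hypot(dx, dy) / image_diagonal      # normalised distance
    band = "close" if d < 0.25 else "medium" if d < 0.5 else "distant"
    return ((ns + ew) or "coincident", band)
```

With this augmentation, the two images *c*1 and *c*2 in the example above would receive different descriptors whenever the relative distances fall into different bands.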

Kulkarni and Joshi [32] and Majumdar *et al.*[33] propose a method which combines both topological and directional relations. However, the method does not capture the notion of distance between spatial entities, so it cannot distinguish between two entities that are near to one another and two that are far apart.

Wang [34] proposes a method that uses the spatial operator *Σ* to capture the interval between the minimum bounding boxes of two spatial entities. This method deliberately omits a precise spatial description between entities: the operator indicates only that there is a space between the two entities, which could be either disjoint, near or far. Given a description like *Σ femur Σ metanephros Σ*, it leads to the spatial knowledge that *femur* and *metanephros* are disjoint, but leaves uncertainty as to whether these two spatial entities are near to or far from one another.
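A toy reading of such a description can be sketched as follows; the parsing convention (entity names alternating with the operator) is an assumption made here for illustration.

```python
# Toy sketch of Wang's Σ descriptions [34]: each adjacent pair of named
# entities is known only to be disjoint; near vs far remains unknown.

def parse_sigma(description):
    # Split on the operator and keep the non-empty entity names
    names = [t.strip() for t in description.split("Σ") if t.strip()]
    return [(a, b, "disjoint") for a, b in zip(names, names[1:])]
```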

Yang and Zhongjian [35] propose an image representation structure using the Mixed Graph Structure (MGS). They demonstrate their method on medical images. The method first extracts spatial entities as primitives. These spatial entities are then organised into a mixed graph structure according to their spatial relations. The approach uses only two types of spatial relations, which are *inclusion* and *adjacency*.
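A mixed graph of this kind, with directed edges for inclusion and undirected edges for adjacency, can be sketched with plain dictionaries. The class shape and the entity names are illustrative assumptions, not the data structure of [35].

```python
# Minimal sketch of a mixed graph structure in the spirit of MGS [35]:
# inclusion is directed (child -> parent), adjacency is undirected.

class MixedGraph:
    def __init__(self):
        self.inclusion = {}     # child -> parent (directed edges)
        self.adjacency = set()  # frozenset pairs (undirected edges)

    def add_inclusion(self, parent, child):
        self.inclusion[child] = parent

    def add_adjacency(self, a, b):
        self.adjacency.add(frozenset((a, b)))

    def container(self, entity):
        return self.inclusion.get(entity)

    def adjacent(self, a, b):
        return frozenset((a, b)) in self.adjacency
```

Mixing edge types this way lets one graph carry both of the relations the approach uses, and mapping two images then reduces to matching their graphs.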

Overall, most image description and mapping approaches in [29, 31, 34] use spatial relations between the entities of an image. The methods in [32, 33] account for both topological and directional relations of spatial entities. The approaches in [30, 33, 35] represent images as graphs: the graphs conceptualise spatial relations between entities, and mapping is then solved as a graph matching problem.

#### Fiducial points as mapping primitives

Some image processing-based mappings use fiducial points as the mapping primitive: these algorithms use a set of fiducial points to determine corresponding anatomical regions between images. Fiducial points are anatomical landmarks that experts use to determine biologically meaningful correspondences between structures [36]. Two images can then be aligned to one another given pairs of corresponding fiducial points in each image. These fiducial points are typically located on the contours of the images or at points of high curvature, such as corners of objects. Because there is currently no standardized set of fiducial points, this comparative study aims to identify examples of fiducial points that have been detected in previous work. Further research is needed to determine the best combination of fiducial points necessary to conceptualise the anatomical space of an image to guide the mapping process. Achieving high accuracy with a large number of fiducial points is not the goal.
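As a sketch of how paired fiducial points can align two images, a least-squares similarity transform (rotation, uniform scale, translation) can be estimated from corresponding point pairs, in the style of Procrustes alignment. The point values below are illustrative; clinical registration methods are typically more elaborate (e.g. affine or deformable).

```python
# Sketch: fit a 2D similarity transform mapping source fiducial points
# onto their corresponding destination fiducial points (least squares).
import math

def fit_similarity(src, dst):
    """Return (scale, angle, tx, ty) mapping src points onto dst points."""
    n = len(src)
    mx = sum(p[0] for p in src) / n; my = sum(p[1] for p in src) / n
    ux = sum(p[0] for p in dst) / n; uy = sum(p[1] for p in dst) / n
    a = b = var = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        sx, sy, dx, dy = sx - mx, sy - my, dx - ux, dy - uy
        a += sx * dx + sy * dy        # cosine-aligned correlation term
        b += sx * dy - sy * dx        # sine-aligned correlation term
        var += sx * sx + sy * sy      # spread of the source points
    angle = math.atan2(b, a)
    scale = math.hypot(a, b) / var
    c, s = math.cos(angle), math.sin(angle)
    tx = ux - scale * (c * mx - s * my)
    ty = uy - scale * (s * mx + c * my)
    return scale, angle, tx, ty

def transform(t, p):
    scale, angle, tx, ty = t
    c, s = math.cos(angle), math.sin(angle)
    return (scale * (c * p[0] - s * p[1]) + tx,
            scale * (s * p[0] + c * p[1]) + ty)
```

Once fitted from a handful of corresponding fiducial points, the transform carries any anatomical region outlined in one image into the coordinate frame of the other.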

Georgescu *et al.*[37] propose a machine learning method to detect fiducial points in a large set of ultrasound heart images from medical databases. These heart images show large variation in appearance and shape. Detection of fiducial points and anatomical regions involves a two-step learning problem: structure detection and shape inference.

Potesil *et al.*[38] and Seifert *et al.*[39] provide recent examples of research work involving the segmentation of fiducial points and the corresponding anatomical regions. Potesil *et al.*[38] propose a method to detect 22 fiducial points based on dense matching of parts-based graphical models. These fiducial points are C2 vertebra, C7 vertebra, top of the sternum, top right lung, top left lung, aortic arch, carina, lowest point of sternum (ribs), lowest point of sternum (tip), Th12 vertebra, top right kidney, bottom right kidney, top left kidney, bottom left kidney, L5 vertebra, right spina iliaca anterior superior, left spina iliaca anterior superior, right head of femur, left head of femur, symphysis, os coccygeum, and center of bladder.

Seifert *et al.*[39] propose a method for the localization of 19 fiducial points in whole-body scans. These fiducial points are the left and right lung tips, left and right humerus heads, bronchial bifurcation, left and right shoulder blade tips, inner left and right clavicle tips, sternum tip bottom, aortic arch, left and right endpoints of rib 11, bottom front and back of the L5 vertebra, coccyx, pubic symphysis top, and the left and right front corners of the hip bone. They also train detectors for ten anatomical region centers: the four heart chambers, liver, kidneys, spleen, prostate and bladder.

These fiducial points are useful to estimate anatomical regions that are present, as well as their most probable locations and boundaries in an image [39]. Subsequently, these fiducial points can be used to establish reliable correspondences between anatomical regions across different images.