
Multi Joint Vision

=== Sensor Placement ===
To achieve a redundant survey of the scene, 40 Ethernet-connected cameras were installed on a metal scaffolding at ceiling height, approximately 3.5 m above floor level. The cameras' fields of view (FOV) cover the whole experimental area with a top-down view; they were set up to achieve a coverage redundancy of approximately 75 %, measured at a height of 1.7 m (the average height of an adult person). The cameras used in the setup are [[http://www.baumeroptronic.com/txg08c.html?&L=1|Baumer TXG08c]] industrial imaging cameras, which provide images of 1024 x 768 pixels at a rate of 28 frames per second each. Image acquisition occurs asynchronously over Gigabit Ethernet, using the GigE Vision (GEV) standard.
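The per-camera bandwidth implied by these figures can be estimated with a short back-of-the-envelope sketch. The pixel format is not stated above, so the calculation assumes raw 8-bit Bayer output (one byte per pixel), a common GigE Vision transport format:

```python
# Back-of-the-envelope bandwidth estimate for one camera.
# Assumption (not stated in the text): raw 8-bit Bayer output,
# i.e. one byte per pixel before debayering on the client node.
WIDTH, HEIGHT = 1024, 768   # pixels per frame
FPS = 28                    # frames per second
BYTES_PER_PIXEL = 1         # assumed 8-bit Bayer

bytes_per_second = WIDTH * HEIGHT * FPS * BYTES_PER_PIXEL
megabits_per_second = bytes_per_second * 8 / 1e6

print(f"{megabits_per_second:.0f} Mbit/s per camera")  # ~176 Mbit/s
```

At roughly 176 Mbit/s, a single camera fits comfortably within one Gigabit Ethernet link, while all 40 together clearly do not.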
    
{{ :research:mujovision:camera-placement.jpeg?nolink&300 |}}
  
=== Network Setup ===
Each of the cameras is connected to one of 40 diskless client nodes, where image capturing and processing are handled. Cameras with adjacent FOVs are assigned to different camera groups. This helps to compensate for the observed fact that human beings in social scenarios such as the coffee-break demonstration scenario tend to flock together rather than distribute evenly over the surveyed area, and it reduces the likelihood of adjacent cameras becoming unavailable simultaneously in case of problems caused by single processing nodes, thus improving system robustness and load balancing between the image processing nodes. The diskless client nodes are assembled from off-the-shelf components, using x64-architecture hexa-core processors and a Linux operating system. They are equipped with two network adapters each. One of these adapters connects the node to the local camera group network, while the other one connects to the client network via a 48-port switch.
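One simple way to guarantee that cameras with adjacent FOVs never share a group is a diagonal-stripe assignment over the ceiling grid. The sketch below is illustrative only: the 5 x 8 layout and the number of groups are assumptions, and the text does not specify the actual assignment scheme used:

```python
# Illustrative camera-to-group assignment: cameras form a regular grid,
# and grid-adjacent cameras (sharing an FOV border) must get different
# groups. The 5 x 8 layout and 4 groups are assumptions for illustration.
ROWS, COLS, GROUPS = 5, 8, 4

def group_of(row: int, col: int) -> int:
    # Diagonal striping: any two 4-neighbours differ by 1 (mod GROUPS),
    # so they always fall into different groups as long as GROUPS >= 2.
    return (row + col) % GROUPS

# Verify the property over the whole grid.
for r in range(ROWS):
    for c in range(COLS):
        for dr, dc in ((1, 0), (0, 1)):
            rr, cc = r + dr, c + dc
            if rr < ROWS and cc < COLS:
                assert group_of(r, c) != group_of(rr, cc)
print("no two adjacent cameras share a group")
```

With such a scheme, losing one processing node blanks out a scattered pattern of cells rather than a contiguous region of the surveyed area.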
    
=== Scope and Challenges ===
A variety of perception tasks has to be addressed by the described camera system. A common denominator of all these tasks is that they benefit from a total survey of the scene in order to be executed effectively. To allow the robots to approach specific persons for interaction, humans in the scenario have to be detected and tracked across the whole apartment in real time, without confusing their identities in the process. To allow robots to plan and execute the manipulation of objects, viable candidates such as tools or containers have to be detected and identified. For robot movement planning, the experimental space has to be segmented into traversable and obstructed areas by obstacle detection and floor segmentation.
    
Since the camera system is designed to cover the whole area, the challenges start with the sheer scope of the system that has to be designed and integrated. At any single moment, a full-size combined image from all 40 cameras would measure 5120 x 6144 pixels, while the combined data rate generated by the cameras amounts to approximately 7.6 Gbps. Since this far exceeds the capacity of a single GigE adapter, the image processing has to be distributed.

This incurs challenges regarding the integration of data across all the processing nodes maintaining the cameras, such as the real-time exchange of extracted features to track persons and objects. A common approach to tracking consists of a detection phase, in which a first estimate of the position of an object of interest is derived from an image in a computationally expensive process without prior knowledge of its position, and a tracking phase, in which the object is followed through successive images by exploiting the knowledge of its position in the preceding images, using a predictive algorithm. For such a tracking approach to be implemented efficiently on a distributed multi-camera system, the exchange of world positions and tracked features between the involved processing clients has to be handled so as to avoid repeated detection phases, and thus improve the performance of the system beyond that of the sum of its parts.
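The combined figures quoted above can be reproduced roughly as follows. Raw 8-bit pixels and a 5 x 8 mosaic layout are assumptions; the raw payload at 28 fps comes out near 7.0 Gbps, so the quoted ~7.6 Gbps presumably includes protocol overhead or slightly higher frame rates:

```python
# Reproducing the combined-image and data-rate figures for 40 cameras.
# Assumptions: a 5 x 8 mosaic of 1024 x 768 frames, raw 8-bit pixels.
CAMS, FPS = 40, 28
W, H = 1024, 768

mosaic = (5 * W, 8 * H)               # combined image dimensions
gbps = CAMS * W * H * FPS * 8 / 1e9   # raw payload, no protocol overhead

print(mosaic)                # (5120, 6144) - matches the text
print(f"{gbps:.1f} Gbps")    # ~7.0 Gbps raw payload
```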
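The detection/tracking split described above can be sketched as follows. This is a generic constant-velocity predictor with a simple distance gate, not the project's actual algorithm, and `detect_people`-style detectors are left out; everything here is an illustrative assumption:

```python
# Minimal sketch of the two-phase approach: an expensive detection phase
# initialises a track, after which a cheap predictive phase follows the
# target frame to frame. Constant-velocity prediction and a distance gate
# stand in for the (unspecified) predictive algorithm.
from dataclasses import dataclass

@dataclass
class Track:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0

    def predict(self) -> tuple[float, float]:
        # Where we expect the target in the next frame.
        return self.x + self.vx, self.y + self.vy

    def update(self, mx: float, my: float) -> None:
        self.vx, self.vy = mx - self.x, my - self.y
        self.x, self.y = mx, my

def step(track: Track, measurements: list[tuple[float, float]],
         gate: float = 50.0) -> bool:
    """Tracking phase: match the nearest measurement within the gate.
    Returns False if the track is lost and a new detection phase is needed."""
    px, py = track.predict()
    best = min(measurements,
               key=lambda m: (m[0] - px) ** 2 + (m[1] - py) ** 2,
               default=None)
    if best is None or (best[0] - px) ** 2 + (best[1] - py) ** 2 > gate ** 2:
        return False
    track.update(*best)
    return True
```

In a distributed setting, the `Track` state (world position and velocity) is exactly the kind of data that would be shipped to the node owning a neighbouring camera, letting that node resume in the cheap tracking phase instead of rerunning detection.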
  * The project is part of the Multi Joint Action demonstrator, where we cooperate with the [[http://www.lsr.ei.tum.de/1/forschung/forschungsgebiete/robotics/multi-robot-lab/|Multi-Robot Lab]] at LSR and the [[http://cotesys.mmk.e-technik.tu-muenchen.de/isg/|Interactive Systems Group]] at MMK.
  * On multi-person tracking, we cooperate with the [[http://www6.in.tum.de/Main/ResearchItracku|ITrackU]] project.
Last edited 23.01.2012 14:13 by eggers