NEXT - New and Emerging Extreme Technology

Thursday 29 January 2009

High Performance Simulation & Mission Rehearsal

Mission rehearsal and mission planning are among the most anticipated uses of simulation; so far, however, users have been reluctant to believe that such goals were reachable. What makes mission rehearsal so complex is the time frame of the mission. In a matter of hours, the operational unit preparing the mission has to produce or gather the data, including real-time data coming from live sensors (UAVs, ...) on the battlefield, prototype the terrain (and order of battle), create the scenario and assess it. The classical uses of simulation for mission rehearsal and planning are: technical preparation (e.g. a cruise missile flight plan), assessment of courses of action, and immersive visualization.

For a long time, people were pessimistic about this use of simulation, arguing that it was almost impossible to find the optimal solution in such a short amount of time. Attitudes are now changing, and this seminar was a good opportunity to assess a few ideas:

Even if the time constraints do not guarantee that the best solution will be identified, the idea is to find a solution that is "good enough".
The simulation will not yield a solution, but it will give the operational staff a metric with which to assess their mission parameters.
Each party should do what it does best: the human will conceive mission plans, and the computer will simulate them at high speed; in the end, the decision will always be human.
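
The division of labour described above can be sketched in a few lines. This is only a toy illustration, not a real mission model: the plan parameters `speed` and `risk`, the scoring formula and the function names are all invented for the example. The point is the shape of the loop: humans supply the plans, the machine runs each one many times and hands back a metric, and no winner is picked automatically.

```python
import random
from statistics import mean


def simulate_plan(plan, rng):
    """One simulated run of a plan, returning a score in [0, 1].

    Hypothetical toy model: 'speed' and 'risk' are invented parameters,
    and the noise term stands in for everything the model does not capture.
    """
    noise = rng.gauss(0.0, 0.05)
    score = plan["speed"] * (1.0 - plan["risk"]) + noise
    return min(1.0, max(0.0, score))


def assess_plans(plans, runs=200, seed=0):
    """Run each human-authored plan many times and return its mean score."""
    rng = random.Random(seed)
    return {
        name: mean(simulate_plan(plan, rng) for _ in range(runs))
        for name, plan in plans.items()
    }


# Plans are written by people; the values here are purely illustrative.
plans = {
    "plan_a": {"speed": 0.9, "risk": 0.5},
    "plan_b": {"speed": 0.7, "risk": 0.2},
}

metrics = assess_plans(plans)

# The metric goes back to the planner; the code deliberately picks no winner.
for name, score in sorted(metrics.items()):
    print(f"{name}: mean simulated score {score:.2f}")
```

Running more plans or more repetitions per plan is exactly where raw computing power matters: the time budget fixes how many alternatives can be scored before the decision has to be made.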

High performance simulation is now made possible by the availability of very powerful architectures (close to 1 TFLOPS) in the field, interconnected (if needed) with C4I systems, at a relatively low price. Until now, mission planning and mission rehearsal were confined to headquarters; our guess is that tomorrow we will see hardened "battlefield computers" allowing mission rehearsal in the field, inside the shelters. Compared to cluster solutions, such low-cost, portable simulation solutions have the advantage of using COTS components and not requiring heavy maintenance. High performance simulation could be the "Swiss Army Knife" of tomorrow's command posts.

Thursday 24 July 2008

ISC08: Back to complex programming?

Supercomputing Europe in Dresden, Germany (ISC08) is over and brought its share of new announcements in High Performance Computing. This year, two major symbolic milestones were announced: the first machine over one PFLOPS (10^15 floating-point operations per second), the IBM Roadrunner, and the first mainstream chips reaching one TFLOPS (10^12 floating-point operations per second), the GPUs from nVidia and AMD/ATI.

Symbols are nothing more than abstract concepts, but these ones remind us that, as the computing world evolves faster and faster, both applications and their programmers need to evolve to exploit these new sources of efficiency.

Compared with the last Supercomputing show in Reno, there were fewer application and tools vendors at the exhibit. However, major hardware vendors and many research institutes attended.

The pre-conference and the Exhibitor Forum were the occasion for interesting technological presentations and announcements. From the hot-chip point of view, here's what we've noticed:

a special session on the IBM Roadrunner, which broke the PFLOPS barrier by using both AMD Opteron and Cell CBE processors;
an AMD presentation on its new 45 nm 4-core Shanghai for 2008 and the upcoming 6-core Istanbul in 2009. These chips will, by the way, come out of the fab right there in Dresden...
an Intel presentation on its enterprise-class processor lines:

the Itanium 9000
the Xeon MP 7000 and 5000, respectively the expandable and the energy-efficient versions
the Nehalem, their second-generation 4-core, 2-way SMT processor, due at the end of 2008
an AMD/ATI presentation on the FireStream 9250, reaching the TFLOPS at 8 GFLOPS/W; many programming environments are available (Brook+, ACML, RapidMind, CAL, OpenCL, HMPP...)
an nVidia presentation on the 240-core T10P with 1.4 billion transistors, also reaching the TFLOPS, at around 6.3 GFLOPS/W

While the hardware is one (big) part of HPC, many presentations focused on how to program such "beasts".

Parallelism has been announced as the inescapable path to higher performance for more than 50 years (« A Computer Oriented Toward Spatial Problems », S. H. Unger, 1958, for a SIMD computer, or « Gamma 60 », Bull, 1957, for an SMT computer). Up to now, however, hardware architects have been kind enough to squeeze Moore's law so as to make the classical Von Neumann & Co architecture fast enough for most people.

But now, Moore's law no longer means faster processors but rather more cores at the same speed... So programs must be parallelized to run faster.
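
One common way to put numbers on this, which the presentations summarized here did not necessarily use, is Amdahl's law: if only a fraction of a program can run in parallel, the serial remainder caps the speedup no matter how many cores are added. The sketch below is a direct transcription of that formula; the function name and the sample fractions are chosen for the example.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of a program is parallel.

    parallel_fraction: share of the run time that can be spread over cores (0..1)
    cores: number of identical cores available (>= 1)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative values only: the same core counts help very differently
# depending on how much of the code has actually been parallelized.
for fraction in (0.5, 0.9, 0.99):
    row = ", ".join(
        f"{cores} cores: x{amdahl_speedup(fraction, cores):.1f}"
        for cores in (4, 16, 240)
    )
    print(f"{fraction:.0%} parallel -> {row}")
```

A purely serial program gains nothing from extra cores, which is why unchanged code stops getting faster once clock speeds stop rising.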

As usual, Thomas Sterling from Louisiana State University gave an interesting presentation, with a humorous slide to sum up the situation: "Core Trek (the next generation?) in Exascale Computing: Space is the 'final frontier'. To boldly code, where no thread has gone before..."

So, what can we imagine regarding the future of programming?

Express more and more things by hand, using old programming languages. This requires programmers to be experts in low-level computer architecture, whereas they are usually experts in their application domains, which are themselves more and more complex. We would therefore need teams of experts in both domains: application and HPC (!)...

Another solution would be to rewrite the applications in higher-level languages that hide the hardware complexity from the programmer while exposing more parallelism. Of course, this generally requires a major - and costly - reengineering of the original code.

Last possibility: carry on programming in the classical way... and wait for other solutions to appear. We may have some ideas at HPC Project...

Tuesday 20 May 2008

A new way of operating

For reasons of ... scheduling, I cannot keep this blog going all the time, in French. I am therefore merging it with my posts on the HPC Project blog (www.hpc-project.com). Some posts will thus be in English. Here is the first one...

Supercomputing is shaping Formula 1 cars: what's next? (see the original on the HPC Project blog)

As the F1 racing season goes on, it becomes clearer that the shape of the cars is evolving, sometimes in counterintuitive ways. Take the BMW Sauber F1 car: the shape of the front wing and the winglets is designed to optimize aerodynamics in a new way. The car is clearly different, and it evolves from one race to the next.

To achieve such optimization, BMW Sauber uses ALBERT2, a 2048-core (Intel Xeon) supercomputer with a peak performance of 12,288 GFLOPS. The supercomputer is used to run computational fluid dynamics (CFD) simulations, using models of more than 100 million cells.
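
As a quick sanity check on those figures, the arithmetic below derives the peak rate per core and the number of mesh cells each core would handle if the model were split evenly. The even split is an assumption made for illustration (real CFD partitioning is not that simple), and 100 million is used as a lower bound for the model size.

```python
PEAK_GFLOPS = 12_288        # quoted peak performance of ALBERT2
CORES = 2_048               # quoted core count
MODEL_CELLS = 100_000_000   # "more than 100 million cells", taken as a lower bound

# Peak floating-point rate available to each core.
gflops_per_core = PEAK_GFLOPS / CORES

# Cells per core under an idealized, perfectly even partition (assumption).
cells_per_core = MODEL_CELLS / CORES

print(f"Peak per core: {gflops_per_core:.1f} GFLOPS")
print(f"Cells per core (even split): {cells_per_core:,.0f}")
```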

BMW Sauber is not an isolated case. Today, nearly every team that designs its own F1 car uses a supercomputer: the ING Renault F1 Team uses an Appro Xtreme-X2 with 1024 sockets and 4096 cores (AMD quad-core Opteron), while in the Ferrari Data Center an Acer/IBM/Racksaver system using AMD Opteron processors reduces the time needed for aerodynamic simulation. Each team either has its own supercomputing infrastructure or uses a partner's.

The 2009 season will be all about supercomputing too. Intel has already announced that it will upgrade the existing Itanium-based system with its new Nehalem architecture (not to mention AMD's upgrade to the Shanghai and Montreal architectures). However, apart from the big IT centers that host the raw computing power needed to analyze and simulate wind tunnel results, a whole ecosystem of "desktop supercomputing" is appearing. Being able to use "local supercomputers" at the track is a real competitive advantage.

Some companies already have exclusive contracts with major F1 teams to provide highly localized weather forecast services during the Grand Prix. In the future, such services could be "embedded" within the racing context. The team would also be able to analyze the data collected during the race in real time, and automatically optimize not only the race strategy but also the physics of the car and its shape, using new materials or composite architectures.

Desktop supercomputing is not only about the race: pre-processing and analysis of the results are always CPU intensive, and such systems can be viewed as "proxies" for large supercomputing architectures, allowing the massive databases generated to be pre-screened.

At HPC Project, we are closely monitoring such trends, since being able to run non-parallel code on local "desktop supercomputers" without reengineering could change the face of F1 engineering work. The championship really is about speed, but not only the speed of the cars.

Thursday 14 February 2008

From nature to engineering

Back after a few adventures (starting a company always keeps you rather busy...), here is a new post about supercomputing. BULL organized an interesting session at the Musée du Vin (ideally located on rue des Eaux, in Paris) on its approach to HPC, and in particular its view of the ideal organization of a petaflop machine (i.e. one capable of performing a million billion floating-point operations per second).

Because of limitations in software libraries, in particular, it is difficult today to manage more than 1000 nodes (1000 processors, for example) in a supercomputer. Yet such a machine has to bring together... 80,000 processors!

IBM's BlueGene supercomputer

The idea for working around this limitation is therefore to build a hierarchy of parallelism, that is, to federate several processors aggregated into intermediate structures. One would thus have 128 processors on a single board, and aggregates of 100 or 1000 boards (thus never exceeding 1000 nodes). The inter-node communication network is also optimized (InfiniBand for the fastest connections, or Gigabit Ethernet for the master nodes). The result is a structure of very densely connected islands, linked to one another by "looser" structures.
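
The sizing argument above (80,000 processors, 128 per board, never more than 1000 nodes managed at any one level) can be checked with a few lines of arithmetic. The function below is only a sketch: its name, the dictionary keys and the choice of 100 boards per aggregate are assumptions made for the example, not BULL's actual design.

```python
import math

MAX_MANAGEABLE_NODES = 1000  # limit quoted in the post


def plan_hierarchy(total_processors, processors_per_board, boards_per_aggregate):
    """Size a three-level hierarchy and report its widest level.

    Levels: processors on a board, boards in an aggregate, aggregates overall.
    """
    boards = math.ceil(total_processors / processors_per_board)
    aggregates = math.ceil(boards / boards_per_aggregate)
    # Largest number of children any single level has to manage.
    widest_level = max(
        processors_per_board,
        min(boards, boards_per_aggregate),
        aggregates,
    )
    return {
        "boards": boards,
        "aggregates": aggregates,
        "widest_level": widest_level,
        "within_limit": widest_level <= MAX_MANAGEABLE_NODES,
    }


layout = plan_hierarchy(80_000, 128, 100)
print(layout)

# A flat organization would have to manage every processor at one level.
print("flat layout within limit:", 80_000 <= MAX_MANAGEABLE_NODES)
```

With these numbers no level manages more than 128 children, whereas the flat layout would need to handle 80,000 at once.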

By coincidence (or perhaps not), this is the theory I defended in my doctoral thesis on the protein regulation network: each cell would possess a protein regulation network made up of highly connected structures, very loosely linked to one another, which gives the cell suitable dynamics. This organization was subsequently validated experimentally.

If we consider that a cell is in fact a protein-based computer, we realize that engineering has rediscovered what seems to be the most natural organization for ensuring maximum computational efficiency and flexibility.

Tuesday 8 January 2008

The dynamics of clay

I recommend taking a look at the website of the prestigious Carnegie Mellon University (CMU). In addition to the various robotics projects we have already had occasion to discuss, Professor Seth Goldstein has undertaken an ambitious project: an "electronic clay" that materializes any three-dimensional object remotely, thanks to a novel process.

The idea is to assemble micro-spheres called "catoms", for "claytronic atoms". Each catom (see the photo of the prototype below) is a programmable micro electromagnet a few millimeters across (44 at present, but future catoms will be under one millimeter), forming what could be called a "3D pixel".

On receiving a signal, these tiny elements self-assemble, rotate, attract or repel one another, and can even change color thanks to a coating of LEDs, allowing them to imitate textures and shapes. Applications abound: design, communication, commerce, but also medicine and research.

The Claytronics promotional film (viewable here, among other places) shows how a group of designers can, in real time, modify a concept materialized as a hyper-realistic clay that creates the model instantly. It is interesting to note that the project is also funded by Intel, which is thus stepping outside traditional microprocessor design to move toward research areas close to nanotechnology.

This prompts a comment from me: where does France stand? A quick search through the latest publications in nanotechnology leaves one wondering. Despite the emergence of the Minatec cluster (one to watch), nothing today suggests that Europe is holding its own in a field of research that, in my view, is likely to be one of the major revolutions in electronics and computing in the coming decades. Sad...

Saturday 15 December 2007

Patience!!!

The blog will be back soon, but my move from MASA to HPC-Project has not yet left me enough time to resume posting. Back to normal in a few days!

Wednesday 7 November 2007

Meet the BOSS

BOSS is the name of the autonomous SUV-type (Sport Utility Vehicle) robot that has just won the prestigious DARPA Grand Challenge prize (worth $2M!). Before discussing the technology, a few words about the competition itself.

The Grand Challenge is a prize awarded by the American DARPA (Defense Advanced Research Projects Agency) to reward autonomous robotics technologies that help keep soldiers out of the immediate danger of the battlespace; the US Army plans to robotize a third of its force by 2015. The race itself consists of covering a 100 km course in 6 hours, fully autonomously. This means: staying on the roads (these are in fact roads in the Californian desert, around a ghost town), merging into "traffic" (teleoperated vehicles, human-driven vehicles, or other autonomous vehicles), and reacting to the unexpected (the organizers being allowed to place obstacles or block roads) by replanning the mission while keeping an average speed of at least 10 mph. To see the official rules, click here.
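
The two figures quoted above, 100 km in 6 hours and a minimum average of 10 mph, can be checked against each other with a short conversion. The helper name is invented for the example; the kilometre-to-mile factor is the standard international one.

```python
KM_PER_MILE = 1.609344  # international mile


def required_average_mph(distance_km, time_limit_h):
    """Average speed, in miles per hour, needed to cover a distance in time."""
    return distance_km / time_limit_h / KM_PER_MILE


needed_mph = required_average_mph(100, 6)
print(f"Average speed needed: {needed_mph:.2f} mph")
print("Consistent with a 10 mph minimum:", needed_mph >= 10)
```

The required average comes out slightly above 10 mph, so the two constraints describe essentially the same pace.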

This race is thus, in a way, a technology showcase as well as a full-scale experiment, in line with the major autonomous robotics programs (notably DARTS, in the 1990s).

This year's winner, BOSS, a modified Chevrolet Tahoe, is interesting in several respects. Built by the prestigious Carnegie Mellon University (CMU), whose autonomous robotics group is arguably one of the best in the world, it is the first to be given a "personality", described by the evaluators as "aggressive but safe" ("Boss is kind of like a soccer mom with some place to be – aggressive but safe"). This personality lets it modulate its perception (provided by a dozen radars, laser sensors and cameras) through a behavioral simulation module, allowing it to plan and replan on the fly as contingencies arise.

For the first time, we are thus confronted at full scale with the interaction between simulation and the real world, a well-known problem. Moving from a simulation to reality raises many difficulties, particularly in handling the various kinds of information (sensor noise in particular). Today, however, thanks to progress in simulation technologies and the impressive growth in computing capacity (notably "embeddable" capacity), it is becoming possible to simulate in real time and to compare the result of that simulation with reality. It is because these new capabilities exist that it becomes possible to use behavioral simulation modules which, previously, could only be used "off-line".

We are therefore witnessing the emergence of situated simulation techniques: the computer, or the robot, spends its time simulating the world and comparing the result of that simulation with reality. In this respect, the DARPA Grand Challenge foreshadows the new capabilities we will soon see appearing in our vehicles, our devices and our daily lives.
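
The simulate-then-compare loop described above can be illustrated with a deliberately tiny one-dimensional example. This is not how BOSS works: the fixed-gain correction, the wrong internal speed model and all numeric values are assumptions chosen to show the principle that an imperfect simulation, continuously confronted with noisy measurements, stays close to reality instead of drifting.

```python
import random


def situated_loop(true_speed=2.0, model_speed=1.5, steps=50, dt=0.1,
                  gain=0.3, noise_sd=0.2, seed=1):
    """Track a moving object with a wrong internal model plus noisy sensing.

    Each step: simulate (predict), sense (noisy measurement), compare
    (prediction error), correct (move the estimate toward the measurement).
    Returns the final estimated and true positions.
    """
    rng = random.Random(seed)
    true_pos = 0.0
    est_pos = 0.0
    for _ in range(steps):
        true_pos += true_speed * dt                      # the real world moves on
        predicted = est_pos + model_speed * dt           # simulate with the internal model
        measured = true_pos + rng.gauss(0.0, noise_sd)   # noisy sensor reading
        error = measured - predicted                     # confront simulation with reality
        est_pos = predicted + gain * error               # correct the estimate
    return est_pos, true_pos


est_pos, true_pos = situated_loop()
open_loop_pos = 1.5 * 0.1 * 50  # what the wrong model alone would predict

print(f"true position:          {true_pos:.2f}")
print(f"simulation + sensing:   {est_pos:.2f}")
print(f"simulation alone:       {open_loop_pos:.2f}")
```

The gain trades responsiveness against sensitivity to sensor noise; real systems typically use more principled estimators, but the loop structure is the same.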