I'm digging into the topic of real-time Solaris at the moment. I can't talk about the project (it's not finance, military or robotics, you wouldn't believe it), but the usage of real-time technology is more common than you might think. Of course there is the processing of sensor data, for example in chemical plants or air traffic control radar. Everybody thinks about such stuff first. But financial systems use it as well. When you work in a business where milliseconds decide about your profits in the automated exchange of stocks and derivatives, real-time technology becomes essential. You want to be sure that writing something to disk doesn't interrupt the processing of your input data. You can log the stuff later, but the "Sell" order has to go out right here, right now. There are several interesting technologies in Solaris for such applications. I will write a tutorial about it soon.
After working through many documents I asked myself: an UltraSPARC T2+ dual- or quad-proc system should be a hell of a system for massively parallel processing of sensor data. Let's imagine you have to observe the value of 128 stocks. You could run the observing code on 128 threads in parallel. No context switches. Okay, subtract some threads for housekeeping, but let's ignore this for a moment. Even when the proc has just half the clock frequency, the latency of a single transaction should be much lower.
The following stuff is just a quick thought experiment ... so correct me if I got something wrong. And: yes, I'm aware of the fact that this math is vastly simplified.
Let's assume your code to get the information, to process it and to make a decision needs 1000 clock cycles. You've optimized the code on both systems by hand to stay in this cycle budget. Let's further assume that you need more instructions on the SPARC, but each instruction is executed in 1 cycle (RISC), whereas several operations in the x86 code need more than one cycle (CISC).
On a normal single-socket, single-core system you would work through the 128 threads serially. On a system where the application does the dispatching (aka execute the decision code serially for each stock and start again afterwards), this leads to a decision-to-decision latency of 128 × 1000 = 128,000 clock cycles.
Even on a quad-core, quad-socket system you could only observe 16 inputs in parallel. But let's assume you do this stuff in a single-socket, single-core system for easier calculation. Now let's assume you work with a 16-core system: with 16 cores you could go through 16 decision processes in 1000 clock cycles. 128 divided by 16 results in 8, so you need 8 runs of 16 threads in parallel to go through all 128 stocks. Thus the time from one decision regarding a stock to the next would be 8000 clock cycles.
Now let's calculate this for the UltraSPARC T2+. The switch from one thread to another is done without a latency penalty. To process 8 observers on a core you would need 4000 clock cycles, as you have two ALUs in a core. Thus the latency from decision to decision is just 4000 clock cycles. After 4000 cycles, 8 decision processes are completely executed and the next run can start. As you have additional cores, even for 128 stocks your decision-to-decision latency is 4000 clock cycles. Why 4000? 4 threads share one ALU. Thus you need 4000 clock cycles to execute 4 threads with 1000 instructions each, even when they look as if they were scheduled to 4 different procs.
Thus an x86-based system would need twice the frequency to do the same stuff in the same time as the frequency-wise slower UltraSPARC T2+. Considering that a 256-thread system is planned on the foundation of the T2+, the results get even more in favour of the T2+. The decision latency for such a system would still be 4000 clock cycles. The time between two decisions for a single stock on such an x86 system would rise to 16,000 clock cycles, as you would need 16 runs to go through all 256 stocks.
And this calculation doesn't factor in high-latency events like L1 or L2 cache misses leading to context switches or idling pipelines. The architecture of the memory subsystem isn't factored in either. This would give advantages to the T2+ as well, as context switching is an expensive operation in terms of clock cycles on an x86 architecture with fewer cores than software threads. On a T2+ you wouldn't have to context switch at all, as you would have enough register sets for all software threads.
It may look counterintuitive, but the more inputs you have to observe, the more viable massively multithreaded solutions get, even when they don't have such a high frequency. I will think more about it to ensure that there is no error in my thoughts.
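The thought experiment above can be replayed in a few lines of Python. This is only a sketch of the simplified model from the text: the function names are made up for illustration, every decision costs the assumed 1000 cycles, and cache misses, memory latency and housekeeping threads are ignored.

```python
import math

CYCLES_PER_DECISION = 1000  # cycle budget assumed in the text


def x86_latency(stocks, cores, cycles=CYCLES_PER_DECISION):
    """Decision-to-decision latency when each core works through its stocks serially."""
    runs = math.ceil(stocks / cores)  # e.g. 128 stocks on 16 cores -> 8 runs
    return runs * cycles


def cmt_latency(threads_per_alu=4, cycles=CYCLES_PER_DECISION):
    """Latency when every stock has its own hardware thread.

    Assumes, as in the text, that 4 hardware threads share one ALU and
    that switching between them costs nothing.
    """
    return threads_per_alu * cycles


if __name__ == "__main__":
    for stocks in (128, 256):
        print(f"{stocks} stocks: 16-core x86 = {x86_latency(stocks, 16)} cycles, "
              f"T2+ = {cmt_latency()} cycles")
```

In this model the x86 figure scales with the number of stocks per core, while the T2+ figure stays constant as long as there is one hardware thread per stock.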
But while the market grew 18.8 per cent compared to Q2 cy'07 only Sun (34.7 per cent) and NetApp (22.9 per cent) outgrew the market and gained share. The other six vendors all gave up share growing as follows
And we haven't even announced the OpenStorage-based products ... as far as I can see from my perspective and the internally available information, they will give us a huge opportunity to grow. Interesting times ahead of us.
ZDNet wrote a nice review of the Sun Fire X4450:
For raw power this is the gutsiest server we have seen. The benchmark results are the best we have seen in high end servers particularly the Sungard score although we cannot claim to have tested such a machine with specs quite like this before.
But the best part of the article is a different one:
Too often we look inside a server and think, "what a mess!" Strange air flow and cable routing is common but not so with the X4450. There is hardly a cable in sight and airflow is straight as an arrow from front to back.
To end all questions and rumours about Rock ... this is the presentation held by my colleagues at the recent HotChips conference. Rock is alive and very healthy.
You may have noticed that the Sun Blade servers don't have Ethernet or FC on board. They just have 4 PCI-E connections to the outside world (2 to the NEM slots, 2 to the PEM slots). Sun announced a really interesting new NEM.
24 ports in total: 10 Gigabit Ethernet ports, 10 10-Gigabit Ethernet ports and 4 SAS ports for outside connections to SAS drive enclosures (albeit not supported in the initial release). You will find further specifications on the website for the Sun Blade 6000 10GbE Multi-Fabric Network Express Module. By using this module every blade has one GbE and one 10GbE port, and you can put up to two modules in a Blade 6000. So you have redundant production network connectivity, redundant admin network connectivity and still two free PEM slots for InfiniBand or FC.
There is a new resident in my home office. I'm jumpstarting fangorn at the moment. fangorn is a Sun Ultra 60 workstation (a dual-proc UltraSPARC II system with 450 MHz on each proc and 1.5 GB memory). The Ultra 60 line had its GA in 1998. This was shortly after my first contact with Sun systems. I purchased it for a few euros from a colleague to rescue it from his cellar and to have a multi-proc SPARC system for some testing. I think fangorn is a good name for such a venerable system.
I should cook tea the whole day. While waiting for the water to boil, I had an idea what we could do with all these threads in a 4-socket Victoria Falls system: let's build a poor man's fault-tolerant system out of it. It's an unrefined idea ...
Let's assume you have to control something, for example a valve in a chemical plant. It's an important valve, thus you can't afford a fault (for example caused by cosmic rays flipping a bit). A common method is to compute something multiple times and compare the results. Two computations aren't enough: when the results differ, you can't tell which result is the right one. Or better: which result has the highest probability of being correct. Practice suggests computing it three times and comparing afterwards. The result with at least two votes out of three in this quorum wins.
You could implement every decision-making instance in an LDOM. Thus you could even run different operating system patch levels in each of the systems.
And this fits with the four procs of a fully blown Victoria Falls system. Three procs for the computations and one proc for the comparator. To ensure that every process runs on the same core it was tested on, you could bind it to a processor.
This would reduce the thread count to 64 effective processors, but you have implemented a poor man's fault-tolerant system (poor man's because there is no hardware lockstepping, it would be just application lockstepping) and well, you have more than enough hardware threads in a Victoria Falls system ...
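The two-out-of-three vote could look roughly like the following Python sketch. It only shows the comparator logic; the function name `vote` and the idea of passing the three results in as a list are assumptions for illustration, not part of any existing product.

```python
from collections import Counter


def vote(results, quorum=2):
    """Return the value that at least `quorum` replicas agree on.

    `results` holds one (hashable) result per replica, for example the
    valve position computed by each of the three instances. If no value
    reaches the quorum, the comparator cannot decide and raises an error.
    """
    value, count = Counter(results).most_common(1)[0]
    if count < quorum:
        raise ValueError(f"no quorum among {results!r}")
    return value


if __name__ == "__main__":
    # One replica was hit by a bit flip, the other two still agree.
    print(vote(["open", "open", "closed"]))
```

With three replicas and a quorum of two, a single faulty result is outvoted, while three differing results are reported as an error instead of being guessed at.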
I will think a little bit more about it in the next few days....
I started this afternoon to write the Kerberos tutorial for the LKSF series. I've just put the first fragments in public and will update them regularly. You can find them here. I will integrate them into the LKSF book as soon as I've finalized it, but this will take a really long time. I haven't even started describing the Solaris implementation so far. Kerberos is such a vast topic ...
They are somewhat similar, but in the end both represent really different concepts of building a server: Sun announced today the Sun Fire X2250 and the Sun Fire X4250. Both are systems in the 2-socket server class, but besides this fact the systems are really different.
Sun Fire X2250 Server
The Sun Fire X2250 is a 2-socket Xeon system in one rack unit. It provides room for up to 2 disks. It was primarily developed for HPC deployments as a cheap compute node or for highly budget-sensitive customers. By the way: it provides an ILOM interface; it's not equipped with the ELOM.
Sun Fire X4250 Server
The Sun Fire X4250 is a two-rack-unit, dual-socket system. The additional rack unit was used in an efficient manner: you can plug up to 16 SAS hard disks into this chassis. At the moment this leads to a total of almost 2.3 TB raw disk capacity. So it's the Xeon-based brother of the Opteron-based X4240. The website specifies it's an ELOM system, but docs.sun.com already contains the documentation for the X4250 with ILOM.
There are many people who don't like the eLOM of our Intel-based x86 servers and want the iLOM of the Opteron servers instead. The cure is near: the eLOM to iLOM migration guide has already appeared at docs.sun.com.
Sun Microsystems, Inc. (NASDAQ: JAVA) today announced the new SPECjbb2005 world record score, which beats all other x86 systems on the market. The Sun Fire X4600 M2 server, equipped with eight Quad-Core AMD Opteron model 8360SE processors and running the Solaris 10 Operating System (OS), posted an x86 World Record score of 683,542 SPECjbb2005 bops (85,443 SPECjbb2005 bops/JVM).
You will find further information at the X4600 Benchmarks page. This matches my impression that the X4600 is one of the big hits from Andy and his teams.