The content of this blog is © Copyright 2009 Ralf Ertzinger. Any reproduction or reuse requires the written permission of the author.
Tuesday, March 24. 2009
iSCSI: Opensolaris target, Fedora initiator, with CHAP

For some reason or another, finding instructions on how to actually configure iSCSI for a rather simple and common use case (one target on Solaris, one initiator on Fedora, CHAP authentication) seems to be pretty hard. Either my google-fu is seriously broken today, or everyone except for me considers this to be so easy and obvious that it does not warrant documentation. Which, having done this for the last three hours, I seriously doubt.

The Solaris side is actually documented quite well (I primarily used this blog entry by alasdair); the Linux side is lacking. It does not help matters that both sides use identically named tools that work in completely different ways.

So, the deal is as follows:
One OpenSolaris system, acting as iSCSI target (that is, the system presenting the storage space).
One Fedora Linux system, acting as iSCSI initiator (that is, the system that wants to use the storage space).
Create 100GB of storage space on the target, let the initiator connect to this storage space and create a filesystem on it.
The connection has to be authenticated one way (the initiator presents credentials to the target).

The Solaris side

The system is running Nevada; the configuration here was done with build snv_110. First, the iSCSI target software needs to be installed and enabled:
# gkp.pl -d /mnt/Solaris_11/Product SUNWiscsitgtu
# svcadm enable iscsitgt:default
#
The backing store for the iSCSI volumes shall be provided by ZFS:
# zfs create -o canmount=off tank/iscsi
# zfs create -V 100G -o shareiscsi=on tank/iscsi/vol001
# iscsitadm list target -v
Target: tank/iscsi/vol001
iSCSI Name: iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8
Alias: tank/iscsi/vol001
[...]
#
The
# iscsitadm create tpgt 1
# iscsitadm modify tpgt -i 212.51.12.90 1
# iscsitadm list tpgt -v 1
TPGT: 1
IP Address: 212.51.12.90
# iscsitadm modify target -p 1 tank/iscsi/vol001
# iscsitadm list target -v
Target: tank/iscsi/vol001
[...]
TPGT list:
TPGT: 1
[...]
#
To secure access to this volume further, the initiator is required to authenticate itself using CHAP before access is granted. To do this, three pieces of information are needed:
The initiator used here has an iqn of
# iscsitadm create initiator -n iqn.2005-03.com.max:01.cb5c4c lain
# iscsitadm modify initiator --chap-name lain lain
# iscsitadm modify initiator --chap-secret lain
[...]
# iscsitadm list initiator -v
Initiator: lain
iSCSI Name: iqn.2005-03.com.max:01.cb5c4c
CHAP Name: lain
CHAP Secret: Set
# iscsitadm modify target --acl lain tank/iscsi/vol001
# iscsitadm list target -v
Target: tank/iscsi/vol001
[...]
ACL list:
Initiator: lain
[...]
#
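The target-side steps above can be collected into one small script. What follows is only a sketch under a few assumptions: the variable names and the `run` helper are made up for this example, the CHAP name is simply set to the initiator alias (as in the commands above), and by default the script only prints the commands instead of running them. The `zfs` and `iscsitadm` invocations are the ones already shown; the elided output above suggests that `--chap-secret` asks for the secret interactively.

```shell
#!/usr/bin/env bash
# Sketch only: replays the target-side commands from this post.
# Prints the commands by default; set RUN=1 to actually execute them.

VOLUME=${VOLUME:-tank/iscsi/vol001}
SIZE=${SIZE:-100G}
PORTAL_IP=${PORTAL_IP:-212.51.12.90}
TPGT=${TPGT:-1}
INITIATOR_IQN=${INITIATOR_IQN:-iqn.2005-03.com.max:01.cb5c4c}
INITIATOR_ALIAS=${INITIATOR_ALIAS:-lain}

# Hypothetical helper: echo the command, run it only when RUN=1.
run() {
    echo "# $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

setup_target() {
    # Parent dataset (not mountable) and the iSCSI-shared volume itself.
    run zfs create -o canmount=off "${VOLUME%/*}"
    run zfs create -V "$SIZE" -o shareiscsi=on "$VOLUME"
    # Target portal group, bound to one IP address and attached to the target.
    run iscsitadm create tpgt "$TPGT"
    run iscsitadm modify tpgt -i "$PORTAL_IP" "$TPGT"
    run iscsitadm modify target -p "$TPGT" "$VOLUME"
    # Initiator object with CHAP name and secret, then the ACL entry.
    run iscsitadm create initiator -n "$INITIATOR_IQN" "$INITIATOR_ALIAS"
    run iscsitadm modify initiator --chap-name "$INITIATOR_ALIAS" "$INITIATOR_ALIAS"
    run iscsitadm modify initiator --chap-secret "$INITIATOR_ALIAS"
    run iscsitadm modify target --acl "$INITIATOR_ALIAS" "$VOLUME"
}

setup_target
```

Running it without RUN=1 is a cheap way to review the exact command lines before touching the system.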
This concludes the Solaris side of things.

The Fedora side

The system is running Fedora Rawhide, close to the Fedora 11 Beta release at the time of this writing. On the Linux side iSCSI is handled by the open-iscsi toolchain, which is installed as follows:
# yum install iscsi-initiator-utils
[...]
#
Since the system in question is a notebook, and the iSCSI target may not be available at all times, the iSCSI service must be instructed not to connect to configured devices automatically:
# perl -pi -e 's/node.startup.*/node.startup = manual/' /etc/iscsi/iscsid.conf
#
The initiator name mentioned in the Solaris section above can be configured freely on the system. A random value is created during package installation and saved in the following file:
# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2005-03.com.max:01.cb5c4c
#
Be sure the name configured there matches the name defined for the initiator on the Solaris side.

Even though a username/password pair was defined on the target, credentials are not needed for target discovery (the process by which an initiator asks a target which iSCSI volumes are available):
# iscsiadm -m discovery -t st -p 212.51.12.90
212.51.12.90:3260,1 iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8
# iscsiadm -m node
212.51.12.90:3260,1 iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8
#
The discovery process has found a single volume exported by the target and added it to the local node list. The iqn matches the value shown above in the Solaris section.
Now the CHAP credentials have to be added to the node so the initiator can actually connect to the volume:
# iscsiadm -m node --target 'iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8' \
    --name 'node.session.auth.authmethod' -v 'CHAP'
# iscsiadm -m node --target 'iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8' \
    --name 'node.session.auth.username' -v 'lain'
# iscsiadm -m node --target 'iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8' \
    --name 'node.session.auth.password' -v 'iscsipassword'
# iscsiadm -m node --target 'iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8'
node.name = iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8
node.tpgt = 1
node.startup = manual
[...]
node.session.auth.authmethod = CHAP
node.session.auth.username = lain
node.session.auth.password = ********
[...]
#
Now the volume can finally be accessed:
# iscsiadm -m node --target 'iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8' --login
Logging in to [iface: default, target: iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8, portal: 212.51.12.90,3260]
Login to [iface: default, target: iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8, portal: 212.51.12.90,3260]: successful
#
After some seconds a device node for this volume should appear in the following directory:
# ls /dev/disk/by-path/
ip-212.51.12.90:3260-iscsi-iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8-lun-0
[...]
#
The disk is ready to be used.
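Since the three CHAP settings differ only in key and value, the initiator side lends itself to a small wrapper. This is a sketch, not a tested recipe: the function and variable names are my own, and it spells the call as `-T ... -o update -n ... -v ...`, which is the form the iscsiadm man page documents for changing a node record, whereas the commands in this post omit the `-o update` part. By default it only prints what it would run.

```shell
#!/usr/bin/env bash
# Sketch only: sets the CHAP parameters on a discovered node record and logs in.
# Prints the commands by default; set RUN=1 to actually execute them.

TARGET=${TARGET:-iqn.1986-03.com.sun:02:218c35e0-0881-cb4f-d6bd-80e08bdc98d8}
CHAP_USER=${CHAP_USER:-lain}
CHAP_PASSWORD=${CHAP_PASSWORD:-iscsipassword}

# Hypothetical helper: echo the command, run it only when RUN=1.
run() {
    echo "# $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Update one key of the node record for $TARGET.
set_node_param() {
    run iscsiadm -m node -T "$TARGET" -o update -n "$1" -v "$2"
}

configure_chap_and_login() {
    set_node_param node.session.auth.authmethod CHAP
    set_node_param node.session.auth.username "$CHAP_USER"
    set_node_param node.session.auth.password "$CHAP_PASSWORD"
    run iscsiadm -m node -T "$TARGET" --login
}

configure_chap_and_login
```

Note that the dry run echoes the password in clear text, so it is best used with a placeholder value.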
Posted by Ralf Ertzinger in Computer, Linux, Software, Solaris at 01:16
Monday, March 23. 2009
Using pkgsrc on Opensolaris, Part 3

Initial install

The tarball already contains a pkgsrc directory at the top level, so it has to be unpacked into the parent of the target directory:
# wget ftp://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc.tar.gz
# gtar -xzf pkgsrc.tar.gz -C /usr
# chown -R builder: /usr/pkgsrc
This unpacks the tarball into the pkgsrc filesystem created in Part 2 and hands the tree over to the build user.

Patching gcc

Unfortunately the gcc suite as delivered with Solaris has a small flaw, which will cause some packages to be built incorrectly (the most famous example is openssl, which will build but not work afterwards). Fortunately the bug has been tracked down and a bandaid is available here. This file is a nice little hack in itself, as it is a shell script and a C source file at the same time. If executed by a shell, the file is put through gcc three times to produce three object files, which are placed into a directory where the compiler can find them and use them instead of the files that were delivered with the compiler.
# bash ./values.c /usr/sfw/bin/gcc
[....]
# ls -l $(dirname $(/usr/sfw/bin/gcc -print-libgcc-file-name))
[...]
-rw-r--r-- 1 root root 763 2009-02-21 17:34 values-Xa.o
-rw-r--r-- 1 root root 763 2009-02-21 17:34 values-Xc.o
-rw-r--r-- 1 root root 763 2009-02-21 17:34 values-Xt.o
#

Bootstrapping

With this particular bug out of the way, the bootstrap can be run:
# cd /usr/pkgsrc/bootstrap
# PATH="$PATH:/usr/sfw/bin:/usr/xpg4/bin" ./bootstrap
[....]
#
Bootstrapping will take a while, but it should run through cleanly. Afterwards there should be some files in

Vulnerabilities and updates

Packages need to be kept up to date, for new features as much as for possible vulnerabilities. In the latter department the bootstrap installed two programs to help with identifying affected packages.
Both programs should be used on a regular basis.

Updates of packages (whether due to a vulnerability or not) are best done via CVS. Updates can be done on the whole package tree or on individual subtrees, as needed. Assuming an older installed version, the following command will update the whole tree to 2008Q4:
# su - builder
$ cd /usr/pkgsrc
$ cvs up -dPR -rpkgsrc-2008Q4
[...]
$
It’s important to check the output of the CVS command for possible problems, especially if local modifications to packages have been made.

Host tools

There are a number of tools that are used by a large number of packages during the build process. Among these tools are
GNU make and GNU tar are already installed; the rest of the packages can be found on the Solaris install media and can easily be installed:
# gkp.pl -d /mnt/Solaris_11/Product SUNWgm4 SUNWgnu-gettext SUNWgpch SUNWunzip \
    SUNWbison SUNWflexlex
[...]
#
Now
# SUNWgm4
TOOLS_PLATFORM.m4= /usr/bin/gm4
TOOLS_PLATFORM.gm4= /usr/bin/gm4
# SUNWgmake
TOOLS_PLATFORM.gmake= /usr/bin/gmake
# SUNWgnu-gettext
TOOLS_PLATFORM.msgfmt= /usr/bin/gmsgfmt
# SUNWgtar
TOOLS_PLATFORM.tar= /usr/bin/gtar
TOOLS_PLATFORM.bsdtar= /usr/bin/gtar
# SUNWgpch
TOOLS_PLATFORM.patch= /usr/bin/gpatch
TOOLS_PLATFORM.gpatch= /usr/bin/gpatch
# TOOLS_PLATFORM.perl= /usr/bin/perl
# SUNWunzip
TOOLS_PLATFORM.unzip= /usr/bin/unzip
# SUNWbison
TOOLS_PLATFORM.bison= /usr/bin/bison
TOOLS_PLATFORM.bison-yacc= /usr/bin/bison -y
# SUNWflexlex
TOOLS_PLATFORM.lex= /usr/bin/flex

Tuesday, March 17. 2009
Using pkgsrc on Opensolaris, Part 2

Preparing the system for

Compiler environment

The Solaris core installation does not come with a compiler, but a version of GCC 3 can be found on the DVD (3.4.3 in snv_109). In addition to the compiler itself several other programs are needed to get
The complete command line for installing all this is
# gkp.pl -d /mnt/Solaris_11/Product SUNWarc SUNWgccruntime SUNWbinutils SUNWgcc \
    SUNWgmake SUNWhea SUNWlibmr SUNWlibm SUNWxcu4 SUNWsprot
The compiler is installed into

Directories and users

By default
There is no good reason to change these defaults here.
# zfs create -o mountpoint=/usr/pkg rpool/pkg
# zfs create -o mountpoint=/usr/pkgsrc rpool/pkgsrc
These two commands create two new filesystems on the root pool and set the mount points to the correct directories. The filesystems are automatically mounted and are immediately ready to be used.

Having a special user for building packages instead of using root is a good idea in general. It protects the system from possible errors in the build system, which might cause files to be written outside the build root (even if those files are completely harmless and not malicious, they are a nuisance nonetheless). The build user should also have no special privileges on the system.
# useradd -d /export/home/builder -m -s /usr/bin/bash -c "pkgsrc build user" builder

Thursday, March 12. 2009
Using pkgsrc on Opensolaris, Part 1

Even though Solaris comes with a large selection of software these days (and some of it in recent versions), there will still be software that is not on the install media. In my case the greatest itch was rtorrent. I was not too keen on the idea of building everything from scratch the classic way ( All this sounded a lot like the FreeBSD ports system to me, which I found to be unavailable on Solaris. But the general idea was sound, and in the immediate neighbourhood I discovered the NetBSD
On the other hand it can be instructed to use the compiler and tools supplied by the guest operating system, if those are available and recent enough, so that just software that the guest system does not supply is built by
Wednesday, March 11. 2009
Building an OpenSolaris storage - Software, Part 6

Since I tend to forget these things, here’s a short list showing various package management tasks.
Posted by Ralf Ertzinger in Computer, Linux, Software, Solaris at 20:50
Friday, March 6. 2009
Building an OpenSolaris storage - Software, Part 5

Some releases ago Solaris introduced a Role Based Access Control system (details and introduction here). The upshot of this is that Profiles can be attached to a user account, giving this account elevated privileges for some tasks. The main administrator account should probably be able to execute all commands with root privileges. Linux/BSD systems use Almost all tutorials about
# gkp.pl -d /mnt/Solaris_11/Product SUNWwbcor
[...]
# usermod -P'Primary Administrator' admin
# su - admin
$ id
uid=500(admin) gid=100(users)
$ pfexec id
uid=0(root) gid=0(root)
The admin user can now execute every command as root without needing the root password (or any password, for that matter) simply by prefixing it with

Building an OpenSolaris storage - Software, Part 4

After the newly installed system has booted for the first time it is time to make Solaris a bit more homely. This involves installing a number of packages, which brings me neatly to the first thing that annoys me (and probably a lot of other people used to current Linux distributions): the package management is an absolute mess.

Most Linux distributions take a two-tiered approach to package management: there are low level tools that allow installation of one or more packages, reading from a local filesystem, removal of packages, listing information about packages and so on. RPM is an example of this, as is DPKG. Then there are high level tools that know about local and remote collections of packages and can resolve dependencies for package installation and removal. YUM, APT and PackageKit are examples of these tools. Unless something is severely broken, the high level tools are used for package management in the usual cases. No one actually wants to do all the manual dependency solving and package downloading themselves; that’s what computers are for, after all. Unfortunately all Solaris provides are tools of the first category.
While I understand that changing the low level tools is impossible for binary compatibility reasons, I do not see a reason why high level tools for handling all the grunt work are not being provided. The logical response to this dilemma is to write a rudimentary high level tool which takes away some of the pain. My version is written in Perl (which, astoundingly, is part of the core install, and not even insultingly old). Currently all it can do is install packages. It relies on the dependency information in the Solaris packages, which is spotty at best.

So, let’s make this system a bit more habitable.
mount -F nfs 10.200.200.1:/jumpstart /mnt
gkp.pl -d /mnt/Solaris/Product SUNWgcmn SUNWwgetr SUNWwgetu SUNWntpr SUNWntpu SUNWgtar \
    SUNWsshcu SUNWsshr SUNWsshu SUNWsshdr SUNWsshdu SUNWbash SUNWless SUNWdoc SUNWman \
    SUNWgnu-coreutils SUNWtoo
Among other things this will install
Setting the root shell

I probably won’t make many friends here, but I like bash. And I also like a root shell I can actually work with. So, root gets a bash. Contrary to what other people might say, this is an absolutely harmless change on modern Solaris systems.
usermod -s /usr/bin/bash root

Configuring SSH

Working on the console itself is tedious, so getting SSH up and running is important. Even though the packages are installed, they are not configured or enabled yet. First, though, I usually define a special group for the accounts that are allowed to ssh into the system. Then the keys are generated, and finally ssh is enabled.
groupadd sshusers
echo "AllowGroups sshusers" >> /etc/ssh/sshd_config
/lib/svc/method/sshd -c
svcadm enable ssh

Security settings

By default Solaris creates
perl -pi -e 's/^CRYPT_DEFAULT=.*/CRYPT_DEFAULT=1/' /etc/security/policy.conf
This will not change existing accounts. Only newly created password hashes are affected. This means that the root password should be set again, to create a new hash.

Adding normal users

Working as root is a bad idea, so adding normal users is the order of the day to do normal work on the system. I prefer to have a special group containing all normal users, so this has to be added as well. In addition, giving every user a separate ZFS filesystem as home directory is a good idea. The dataset Since this is going to be a storage system without a lot of users there is no big problem with
groupadd users
zfs create rpool/export/home/admin
useradd -g users -G sshusers -d /export/home/admin -s /usr/bin/bash \
    -c "Admin User" -m admin
chown -R admin:users ~admin
The user becomes a member of the

Sunday, February 22. 2009
Building an OpenSolaris storage - Software, Part 3

Installing Solaris can be a strange experience for someone who is only used to modern Linux installers.
Yes, there is a graphical installer, but it consists of little more than an X window which basically asks the same questions as the text mode installer. Unless you already know how to install Solaris, and what the installer expects of you, some of the questions and dialogs seem a little strange. Due to Solaris’ focus on binary compatibility some of the defaults don’t make that much sense anymore, either, but changing them to more sensible defaults would cause confusion, or so it seems. For the install on the storage system, though, most of the defaults are sensible, and since Solaris does not have to share any disks with other operating systems the partitioning process is not that painful, either.

The first question the installer asks (always in text mode) is about the general installation mode the user wishes to perform (roughly graphical/text based or rescue shell). Interactive/text mode (option 4) is usually fine. If the system has booted from the network, the installer will not ask about IP configuration for the network cards but assume DHCP for IPv4.

The question about the name resolution service is one of the odd quirks in the installer. The naming service defaults to NIS, which is probably wrong for almost any new installation on this planet. Usually DNS is the right choice here. The installer will then ask for the DNS server IPs and default domains. If the installer cannot resolve the current machine IP via these nameservers, it will explicitly ask for confirmation that the data is really right.

The default answers for the next questions (Kerberos/NFS4) are sensible in the usual cases. When asked for the file system to use for the root filesystem, the default is UFS. Change it to ZFS. I prefer to use separate datasets for

The (almost) final question is for the amount of packages to be installed. The installer offers five predefined groups, ranging from several hundred megabytes to almost three gigabytes of installed data.
Selecting the smallest set will do fine here: the system will boot and have network and NFS client support, which is enough to get at the rest of the packages to install later.

That’s it, basically. The installer will now copy the files to the boot disk, prepare the bootloader and restart the system.

Monday, February 16. 2009
Building an OpenSolaris storage - Software, Part 2

After solving the cache issue on the MSI board it was time to install the OS for real. I had planned to do the install via the network, for two reasons.
Installing Solaris via the network has been supported for a very long time, and Solaris being what it is the process has not changed very much. That means that there are some quirks in it. Because this was the first Solaris installation in my network some non-Solaris machine had to take over the job of providing the various services needed for an installation. The job was delegated to my notebook running Linux. The following services are needed to install Solaris over the network:
In addition to that some software:
First and foremost, though, the network card in the system that is to be the target of the installation needs to support PXE. PXE defines a way for a network card BIOS to obtain an IP address via DHCP and load a piece of software from the network, which is then executed by the system. In addition PXE provides a handful of library functions that give the just loaded software a way to talk to the network itself (to load even more software, for example). Most modern network cards and BIOSes support PXE and booting from the network. If this is not the case, the nice people over at etherboot.org have a large library of network card specific code that can be booted via a floppy disk or a bootable CDROM and which will provide the network card with the appropriate capabilities.

A dedicated network will be used for the installation, namely 10.200.200.0/24. The install server has the IP 10.200.200.1.

Preparing the tftpboot directory

The TFTP server will serve its files from the
/tftpboot/
|-- mboot.c32
|-- pxelinux.0
|-- pxelinux.cfg
| `-- default
`-- solaris
|-- platform
| `-- i86pc
| `-- kernel
| `-- unix
`-- x86.miniroot
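The directory part of this layout can be created in one go. The following is a minimal sketch; the function name and the TFTP_ROOT variable are made up for this example, and the root defaults to a scratch directory so that a trial run touches nothing else.

```shell
#!/usr/bin/env bash
# Sketch only: creates the empty directory skeleton shown in the tree above.
# TFTP_ROOT is an assumed variable; it defaults to a scratch directory.

make_tftp_skeleton() {
    local root=$1
    # pxelinux.cfg/ will hold the "default" file, the kernel/ directory "unix".
    mkdir -p "$root/pxelinux.cfg" \
             "$root/solaris/platform/i86pc/kernel"
}

TFTP_ROOT=${TFTP_ROOT:-$(mktemp -d)}
make_tftp_skeleton "$TFTP_ROOT"
# List the directories that were created, relative to the root.
(cd "$TFTP_ROOT" && find . -type d | sort)
```

The five files from the listing (mboot.c32, pxelinux.0, pxelinux.cfg/default, unix and x86.miniroot) still have to be copied into place afterwards.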
The PXELINUX configuration file contains:
DEFAULT jumpstart
LABEL jumpstart
KERNEL mboot.c32
APPEND -solaris solaris/platform/i86pc/kernel/unix -v -m verbose -B install_media=10.200.200.1:/jumpstart --- solaris/x86.miniroot
The parameters describe the path of the kernel image under the TFTP server root ( The two files in the

Preparing the DHCP server

There is not much to this, really. All that is required (besides the obvious IP address and netmask) is the IP address of the TFTP server and the filename of the SYSLINUX bootloader in the TFTP directory structure. The complete config file (for the ISC DHCP server) looks like this:
subnet 10.200.200.0 netmask 255.255.255.0 {
range 10.200.200.128 10.200.200.200;
option routers 10.200.200.1;
option subnet-mask 255.255.255.0;
next-server 10.200.200.1;
filename "/pxelinux.0";
}
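A typo in one of these addresses is easy to make and annoying to debug, so a quick check that the pool and the server address really lie inside the install network can help. This is only a sketch in plain bash arithmetic; the function names are made up for this example and it does not read the DHCP configuration file itself.

```shell
#!/usr/bin/env bash
# Sketch only: checks that addresses from the DHCP configuration above lie
# inside the install network. The function names are made up for this example.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# in_subnet ADDRESS NETWORK NETMASK: succeeds if ADDRESS is inside the subnet.
in_subnet() {
    local addr net mask
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Pool start, pool end and the install server from the configuration above.
for ip in 10.200.200.128 10.200.200.200 10.200.200.1; do
    if in_subnet "$ip" 10.200.200.0 255.255.255.0; then
        echo "$ip is inside 10.200.200.0/24"
    else
        echo "$ip is OUTSIDE 10.200.200.0/24"
    fi
done
```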
Preparing the NFS server

There are several ways to present the contents of the install media via NFS, but for this install the method that worked best for me was to simply mount the ISO into a directory and share that via NFS.
# mkdir /jumpstart
# mount -o ro,loop /tmp/nv105.iso /jumpstart
The entry in
/jumpstart *(ro,no_subtree_check,sec=sys)

Putting it all together

That should be it, basically. If the machine is turned on and set to boot from the network, the following chain of events will take place, provided all goes well:
Friday, February 13. 2009
Experiences with the MSI support

I have to say that I am very pleased with the technical support I have received so far from MSI for my IM-GM45. Besides the MTRR issue I wrote about here, I had filed a second, minor request. The second request regarded the shared video memory setting for the on-board graphics chipset. In the original BIOS the minimum amount of memory that could be allocated for video RAM was 32MB, which is way too much for the text mode that I need. So I filed a request asking for the possibility to select a lower amount of memory (preferably 1MB or less). Two business days later I received a mail from the MSI support containing a BIOS with the fix for my specific request (fortunately it also contained the MTRR fix). This is admittedly much more than I expected. Thanks to MSI for the quick and helpful response to both of my requests.

Sunday, February 8. 2009
Building an OpenSolaris storage - Hardware, Part 2

Earlier this week the last of the hardware I ordered arrived, so I could finally assemble the whole system. Contrary to my expectations the CPU did not come with a fan (which was just as well, as I already had two), but with a lot of packaging instead. Someone at Intel should think about cutting down on all that plastic just to ship a tiny piece of silicon.

Speaking of silicon, contrary to almost all other current x86 CPUs the Core2Duo Mobile processors do not have a metal cap to protect the die; instead the die sits rather unprotected on top (similar to the Athlon XP and Pentium 3 processors). This makes attaching a fan an interesting experience, because it is quite easy to damage the die while doing this. Coolermaster is obviously aware of this: the contact side of the fan contains a foam spacer which surrounds the die when the cooler is placed on the CPU and which is supposed to prevent tilting. The cooler is then fastened to a mounting plate sitting on the bottom of the board using some spring screws.
This works insofar as I was able to mount this without damaging the die.

Getting all this into the case is a bit tricky but manageable, as the mainboard tray can be pulled out from the case. The case has quite an assortment of LEDs and switches; unfortunately not all of these have corresponding connectors on the board (the two LAN LEDs, the ERROR LED, the Mute switch and the intrusion detection switch). The SATA cables are numbered, which makes it easy to plug them into the right connector, so the mainboard’s view of drive numbering lines up with the numbers on the case.

After putting all this together I switched the system on for the first time. All went well and the BIOS came up. The system is not exactly quiet, but I find the noise far more bearable than the Thecus one, mainly because most of the noise is airflow, and not the droning of the fans. The noise level is constant, too; so far I have not heard the case fans increase speed.

Next up: installing software

Friday, February 6. 2009
Building an OpenSolaris storage - Software, Part 1

After the hardware was assembled I made a quick attempt to boot Nevada 105 from a DVD in order to see how things went (the installation proper will be made via the network). The system booted, but was very slow. It took over 10 minutes to get to the first prompt (which asks about the kind of installation you want to perform, and which is usually reached in a few seconds). Older releases and OpenSolaris 08/11 behaved the same. A Linux system booted from a USB stick behaved normally, though.

I wrestled with this for two days, but then noticed something while running memtest86+ on the system. Since the system has 4GB of RAM, and quite a lot of the physical address space between 3 and 4 GB is used by PCI devices, quite a lot of physical memory is remapped to physical addresses above the 4GB mark.
memtest shows the start and end address of the block it is currently testing (this is why start and end addresses above 4GB can show up there, even if the system has less than 4GB of RAM). While testing the relocated memory block memtest slowed to a crawl, while the memory below 4GB was tested at normal speed. It looks like memory accesses above 4GB are not covered by the processor cache. Linux seems to put its kernel below this magic mark, and thus runs normally, while Solaris lands above it, and is less than usable.

Memory layout

In order to explain what is going on here (and why it is bad) a small detour is in order. Physical address space is a shared resource on most architectures, the Intel x86 platform (in 32 and 64 bit) included. It is shared between real, physical memory and IO memory. Physical memory is the kind that comes (usually and these days) in DIMM memory sticks that go into the appropriate slots on the main board. IO memory, on the other hand, is a way of talking to extension cards, for example network adapters and graphics cards. These adapters register one or more areas of memory with the BIOS during system startup. Accessing these addresses results in reading or writing to these extension cards instead of real memory. The use of this is that programs can treat extension cards just like normal memory. The end result of this is that a given physical memory address can have one of three “backgrounds”:
The memory ranges that devices register usually live between the 3GB and 4GB physical address space. This was all well and good as long as practically no system had that much real memory, so there was no contention for address space. However, two things happened in the last years: memory got incredibly cheap, and devices got more hungry for memory space. Modern graphics cards, for example, map a large chunk of their on board graphics RAM into the memory space, sometimes all of it, easily taking up half a gigabyte or more of address space. There are two ways to handle address conflicts in this situation. The easy way is to simply ignore the physical memory in the address ranges claimed by devices. The physical memory cell becomes inaccessible, and the storage it provides is lost. This is obviously not a popular solution. The other way is to relocate the physical memory from the contended address spaces into non-contended space. This usually means “above the 4GB border”. So although you may only have 4GB of physical RAM in the system some of it must be accessed at addresses above 4GB. The exact layout of the memory is passed from the BIOS to the system in the so-called E820 memory maps. 
On the MSI IM-GM45 board with 4GB of memory installed it looks like this:
BIOS-e820: 0000000000000000 - 0000000000099000 (usable)
BIOS-e820: 0000000000099000 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bdc80000 (usable)
BIOS-e820: 00000000bdc80000 - 00000000bdc8e000 (ACPI data)
BIOS-e820: 00000000bdc8e000 - 00000000bdcd0000 (ACPI NVS)
BIOS-e820: 00000000bdcd0000 - 00000000bdce0000 (reserved)
BIOS-e820: 00000000bdcec000 - 00000000bde00000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 000000013c000000 (usable)
The lines marked with The last line shows the physical memory that has been relocated above the 4GB memory barrier. Almost one gigabyte of physical memory has been relocated.

Processor caches

These days there are multiple layers of caches between the CPU core and the main memory. Data read from memory is held in the caches, for it might be needed again soon, and data written to memory is held for the same reasons. Data written to memory addresses belonging to devices, however, may or may not be eligible for caching. While memory is expected to keep its content stable (unless explicitly written to), devices may change the content of their memory maps as they see fit, and caching the values read or written would screw with the CPU’s world view. CPUs therefore contain a list of memory ranges and the cache policies associated with those ranges. These lists are called memory type range registers (MTRR) on Intel CPUs.
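Going back to the E820 map for a moment, the "almost one gigabyte" can be checked with a bit of shell arithmetic. This is just a sketch; `range_mib` is a made-up helper that takes the start and end address of one map line and prints the size of the range in MiB (rounded down).

```shell
#!/usr/bin/env bash
# Sketch only: size of an address range from the E820 map, in MiB.
# range_mib START END takes the two hexadecimal addresses of one map line
# (without a 0x prefix, as printed by the kernel).

range_mib() {
    echo $(( (0x$2 - 0x$1) / 1024 / 1024 ))
}

# The usable block that was relocated above the 4GB mark:
range_mib 0000000100000000 000000013c000000   # prints 960

# The large usable block below the 4GB mark (rounded down):
range_mib 0000000000100000 00000000bdc80000   # prints 3035
```

So 960MB of physical memory ended up above the 4GB border on this board, and that is the part which depends on the cache settings for addresses above 4GB.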
Below is the list from a different system, also with 4GB of physical memory:
reg00: base=0x100000000 (4096MB), size= 512MB: write-back, count=1
reg01: base=0x120000000 (4608MB), size= 256MB: write-back, count=1
reg02: base=0x00000000 ( 0MB), size=2048MB: write-back, count=1
reg03: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg04: base=0xc0000000 (3072MB), size= 256MB: write-back, count=1
reg05: base=0xcff00000 (3327MB), size= 1MB: uncachable, count=1
Three quarters of the memory range between 3 and 4GB do not appear on this list, and are thus considered uncacheable by the CPU. The memory above 4GB, however, is marked as cacheable (

Now the same list on the MSI system:
reg00: base=0x00000000 ( 0MB), size=4096MB: write-back, count=1
This is wrong in several ways, but the most important one is that the physical memory above 4GB is no longer cached by the CPU. The result is that access to that memory becomes painfully slow. This also explains why the problem goes away when only 2GB of memory are present: no physical memory is relocated beyond the 4GB barrier, because not enough address contentions exist, so all physical memory is being cached again.

I have filed a support request with MSI and hope someone there understands the problem.

[Update 1] After a short discussion about supported operating systems, MSI has sent me a newer BIOS file. I’ll try that this evening.

[Update 2] The new BIOS does indeed fix this issue.
The new MTRRs look like this:
reg00: base=0x13c000000 (5056MB), size= 64MB: uncachable, count=1
reg01: base=0x00000000 ( 0MB), size=4096MB: write-back, count=1
reg02: base=0x100000000 (4096MB), size=1024MB: write-back, count=1
reg03: base=0xc0000000 (3072MB), size=1024MB: uncachable, count=1
reg04: base=0xbdd00000 (3037MB), size= 1MB: uncachable, count=1
reg05: base=0xbde00000 (3038MB), size= 2MB: uncachable, count=1
reg06: base=0xbe000000 (3040MB), size= 32MB: uncachable, count=1
There are still some weird edges about this, but by and large it does what it should: all memory (minus that for the onboard graphics) is cached now. My thanks to the MSI support for the fast (and working) response.

Sunday, January 25. 2009
Reading OSX install media under Linux

If you ever tried to take a look around an OSX install DVD under Linux you may have been surprised by
$ du -sh /media/OSX86DVD/
132K /media/OSX86DVD/
There is no magic here, as seen by the fact that the same disk inserted into a running OSX system shows a completely different file structure with more than 4GB of data. The explanation for this is that there are two filesystems on the DVD. First, a normal ISO9660 CD file system, which contains some bootloader files, and not much else. This filesystem is the one that Linux (and Windows) see by default. Behind the ISO9660 filesystem there is another one, which spans the rest of the disk. This is a complete hard disk image with its own partition table, which contains the real installer data. As Linux does not expect this, it does not try to access this filesystem, and the files remain invisible.

In order to get at the files in the second part of the disk some command line magic is necessary. You will need root privileges for the following operations. I assume that the DVD device is

The Linux kernel has a nice block device mapping layer which allows us to take a slice out of an existing block device and present this slice as a new device.
Even more, there is a helper tool that inspects a block device, looks for partitions and automatically creates such a mapping for each partition. This tool is called kpartx. So we use kpartx on the DVD device:

# kpartx -a /dev/sr0
device-mapper: reload ioctl failed: Invalid argument
create/reload failed on sr0p1
device-mapper: reload ioctl failed: Invalid argument
create/reload failed on sr0p2
#

That did not work too well. For some reason the device mapper does not like CDROM/DVD devices. So we’ll have to get a bit inventive. We create a loop device which is backed by the DVD device, and use that.

# /sbin/losetup /dev/loop0 /dev/sr0
# /sbin/kpartx -av /dev/loop0
add map loop0p1 (253:5): 0 63 linear /dev/loop0 1
add map loop0p2 (253:6): 0 9178688 linear /dev/loop0 448
#

The command found two partitions and created a mapping for each. The first partition is the ISO9660 file system at the beginning of the disk, the second one is the partition we are interested in. Let’s look at what’s in it.

# file -s /dev/mapper/loop0p2
/dev/mapper/loop0p2: Macintosh HFS Extended version 4 data last mounted by: '10.0', created: Tue Oct 30 00:32:01 2007, last modified: Wed Dec 19 16:45:14 2007, last checked: Tue Oct 30 00:32:01 2007, block size: 4096, number of blocks: 1147336, free blocks: 192685
#

That looks good. We can now mount the filesystem and look around.

# mount /dev/mapper/loop0p2 /mnt
# ls /mnt
Applications Desktop DB dev Install Mac OS X.app mach_kernel sbin tmp vanilla Volumes
bin Desktop DF etc Library private System usr var
#

To get rid of all this again we have to unmount the file system, destroy the mappings and release the loop device:

# umount /mnt
# kpartx -d /dev/loop0
# losetup -d /dev/loop0

Saturday, January 24. 2009
Building an OpenSolaris storage - Hardware, Part 1

Today the first half of my order arrived: the case, the mainboard and the fan. As this delivery was somewhat unexpected (the mainboard is a new model, and I had expected it a week later), I now have new hardware I cannot use, due to the lack of a CPU.
This is a bit embarrassing, but cannot be changed right now, so I’ll start with what I have.

My apologies for the appalling quality of the pictures, but all I have in the way of digital image capture is the camera in my cell phone.

The Case
The first thing I noticed about the case is how solid it looks and feels. Contrary to the photos I had seen so far the case is not all black, but the sides and the top and bottom are a dark silverish grey. The front, however, is black. I have to say it looks quite good. Although the front and sides are plastic, there is absolutely nothing cheap about the feeling. This is underlined by the weight of the thing. Even though there is no power supply in it (that is external), and no parts have been mounted yet, the empty case weighs over seven kilograms. Under the plastic outside panels there is a massive metal cage. It really has a no-nonsense feel to it.

The inside is full of pre-routed cables that lead to the front panel, the power distribution plane and the hard disk backplane. The case has a 20+4-pin ATX power connector plus the four pin additional CPU power connector most current boards need. Luckily, all the MSI board requires is the 20-pin ATX connector.

A pleasant surprise (aside from the colour) was that the case has a multi-format card reader already built in. From the manufacturer’s site I had gathered that this was an optional extra, but my case came with one included. Also included is a CPU fan, but that is quite specific to a certain mainboard, and will not fit most other boards, so it is useless to me.

What is not included is a manual, at least I have not found any. As all of the cables are clearly labeled this is not much of a problem, though.

The Board
The mainboard comes with the usual assortment of cables (2xSATA, 2xSATA power adaptor, 44-pin IDE) and the rear panel bracket. In addition, it also contains a CPU fan, which surprised me. It also is a Coolermaster model, but not the same one I ordered separately.
If, as I suspect, the CPU also comes with a fan I’ll have quite enough of those things.

The board also had a pleasant surprise, this one on the bottom of the board. MSI put a CF card socket there, which the web site stated as an optional extra. I think I’ll use a CF card instead of the notebook hard disk, as this produces less noise and heat.

The Fan
Well… it’s a fan, right? Goes on the CPU, and hopefully does not make too much noise. I can always threaten it with the other fan if it does.

Building an OpenSolaris storage - History

This is supposed to be a documentation of my endeavour to build an OpenSolaris based storage machine for my home use. Coming from a Linux background myself it will also serve as a notebook of how to do stuff under Solaris.

History
For years I had a midi tower based system running, which was both my internet router and the local storage machine. This machine was shut down eventually, energy prices being what they are, and was replaced by an ASUS router running OpenWRT for internet access. The storage facility was not replaced, so only the hard drives on the client machines themselves were left.

In the middle of last year I finally had enough of that and started looking around for a small storage appliance. I wanted something I could play around with, so being able to screw with or replace the original operating system was a must. I ended up with the Thecus N2100, which is a small, ARM based NAS enclosure running Linux from embedded flash, and able to house two SATA hard disk drives. It’s possible to get a fully functional Debian system on it if you’re not afraid of poking around with serial ports (which I am not), so it seemed like a good choice. It was ordered together with two Seagate 7200.11 1TB drives.

It quickly turned out that the Thecus and the Seagate drives did not like each other a whole lot, which is probably due to the rather high spinup current that the Seagate drives need (3A on the 12V rail).
This was more than the Thecus could provide, so the drives did not spin up most of the time. So the two Seagate drives were replaced with two Samsung 1TB drives, and the Seagates were banished to the shelf. In hindsight this was probably a good thing, because a) the Seagates did not have time to fill up their log and run into the current firmware bug, and b) I had four 1TB drives lying around, which would come in handy later.

The Thecus liked the Samsung drives a whole lot better, and the original firmware was quickly replaced with a Debian Lenny distribution. From a purely administrative standpoint all this worked very well: the distribution detects all the hardware in the system (not that there is a whole lot of it, but nonetheless), including the multi-coloured LEDs in the front panel and the fan controller.

The main problem with the Thecus was speed. The system is equipped with a 600MHz ARM processor, which sounds quite beefy, especially compared to the other NAS storage enclosures out there, which usually have less. As this was unsatisfactory a new solution was needed.

In the meantime I had played around with OpenSolaris (in the form of the bi-weekly nevada snapshots), and was quite impressed by its ZFS file system. So I wanted to use it for the new storage. This meant using an Intel based machine, though (getting a Sparc based enclosure seemed to push my luck), so I started looking around again. Thecus offered a five disk hot-swap enclosure (the 5200(PRO)) with a 600MHz or 1.6GHz Celeron processor, a Marvell SATA controller and Intel Gigabit Ethernet, booting from an internal flash disk. While talking this through with several people on IRC (thanks, ofu!) it became clear that I could get more performance for the same money when building the system myself. (Ironically, I have gotten my hands on a Thecus 5200 based system as well, so I get to have the best of both worlds. Life is great, sometimes.)

I did not want a midi or mini ATX tower though, so the choice was getting slim.
ofu again pointed me towards the Chenbro 340 case, which has four hot swap capable SATA bays and takes a Mini-ITX board. Finding a fitting board turned out to be somewhat complicated, as Mini-ITX boards with (at least) four SATA ports are rare (in theory it is possible to put a PCI card into the case using a riser card, but I did not want to go that way). In the end I chose the MSI IM-GM45, which has four on-board SATA ports connected to an Intel ICH9M-E controller, two Intel gigabit ethernet ports and takes an Intel Penryn processor (among others). It also has an IDE connector, which will drive the boot disk (the four SATA drives will run in a RAIDZ configuration, from which Solaris cannot yet boot).

The final list of parts for this project is as follows:
The hard disks are already there (the two Seagates on the shelf, and the two Samsungs in the old enclosure). These will be reused. I also have several old notebook drives lying around, one of which will be used as the boot disk.
