Monday, July 24, 2017

HP Z820 Workstation & GTX 1080 - Will it work?

The short answer is: Not out of the box, but it will work with a very specific adapter.

The long answer is: To power a GTX 1080 in a Z820 Workstation, a single 6-pin-to-8-pin PCIe power adapter will not be sufficient. There has been some discussion about whether it might work, because there isn't much information about the PSU and how it's configured internally. Sometimes a 6-to-8-pin adapter does work with certain PSUs that exceed the normal specification of the 6-pin connector - perhaps deliberately, to enable the use of such adapters. On the Z820 with the 1125W PSU it does not work.

The system boots fine and is usable until you stress the GPU, at which point it will promptly bluescreen. To be fair, on paper it should not work: a 6-pin connector provides only 75W and the PCIe slot gives you another 75W, amounting to 150W. The GTX 1080, on the other hand, needs about 170-180W under full load. So how does one use a >150W card in the Z820? You will need something like this:



A dual 6-pin-to-8-pin adapter - unfortunately they are about as easy to find as pink unicorns (try eBay?). It gives you 225W of power (75+75+75W) and will make your card run stable with 35-45W of headroom. Nvidia does not recommend these adapters due to the "grounding requirements" of these power-hungry cards, so you will not find them sold by Nvidia, nor will you get one with your GTX 1080 purchase. They do work, though, so if you're out of options and are willing to risk a configuration that might not be covered under warranty, then a dual 6-pin-to-8-pin adapter is pretty much your only option. These adapters are probably hard to find because most people would simply buy a new PSU to meet the GTX 1080's requirements. As a Z820 owner, however, you don't have that luxury because of the non-standard PSU form factor.

Saturday, July 22, 2017

CentOS 7 & GTX 1080

CentOS 7 does not currently support Nvidia Pascal GPUs out of the box, so if you are running into issues with the graphical installer, it's a safe bet that it's because you have a Pascal card in your machine. You can work around this by using the text-based installer, but its partitioning isn't as full-featured, so you might be forced to use the automatic partitioning option.
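
For reference, the usual way to reach the text-based installer is to edit the boot entry at the installation menu and append standard Anaconda/kernel options like the ones below. I'm listing them as a hedged sketch from memory, not as the exact parameters I used:

  # At the CentOS 7 boot menu, press Tab (BIOS) or 'e' (UEFI) to edit the kernel line, then append:
  inst.text nomodeset
  # inst.text  - forces the text-based Anaconda installer
  # nomodeset  - stops the kernel from setting a graphics mode the Pascal card chokes on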

Once CentOS is installed you can simply download the proprietary Nvidia drivers - which work quite nicely, I might add - and be up and running with your desktop in no time. As long as you know how to navigate the command line until the proper drivers are installed, it shouldn't be much of a challenge.
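
Roughly, the process looks like the sketch below. The package names and installer file name are examples (grab the .run installer for your card from nvidia.com), but the general flow - blacklist nouveau, rebuild the initramfs, run the installer from a text console - is the usual one:

  # Build prerequisites for the Nvidia kernel module
  yum install -y gcc kernel-devel kernel-headers

  # Blacklist the nouveau driver and rebuild the initramfs
  printf 'blacklist nouveau\noptions nouveau modeset=0\n' > /etc/modprobe.d/blacklist-nouveau.conf
  dracut --force

  # Drop to a text console and run the installer (the filename will differ)
  systemctl isolate multi-user.target
  sh ./NVIDIA-Linux-x86_64-*.run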

I have a GTX 1080 in my workstation and the graphical installer hung while systemd was initializing. You'd think at least VESA graphics would work on Pascal GPUs, but it doesn't - at least not currently. Hopefully the CentOS installation disks will be updated at some point to include at least rudimentary GTX 10XX support.

Saturday, January 21, 2017

List of maker boards

Ever since the Raspberry Pi became popular, several new ARM development boards have hit the market; some even predate the Pi but gained more traction once the Pi paved the way into the mainstream. It's getting to the point where it's difficult to keep track of all the offerings, so this post is a reference list for myself which I plan on updating as new interesting devices become available. It only lists hacker-friendly mainboards, not ARM devices in general. It would be fun to add some MIPS-based boards too - some do exist - but the ARM-based devices tend to be more interesting and popular. I list only the highest-spec model available for each board.

With the performance boost of the Raspberry Pi 2 & 3, most alternative maker boards lose their appeal - but the various outputs or the presence of gigabit ethernet still make some alternatives the better option. Recently Udoo has caught my attention as a project with interesting ideas, and their recent x86 model has great potential for real-world deployments where the Raspberry Pi is too limited - digital signage springs to mind. I've known about Udoo for years, but now that their x86 board is on its way I think it has excellent potential, even though it's a lot more expensive than the Pi.

Board | URL | CPU | Cores | GPU | RAM
Raspberry Pi 3 | Website | ARM Cortex A53 | 4@1.2GHz | Broadcom VideoCore IV | 1GB
Beagleboard | Website | ARM Cortex A8 | 1@1GHz | PowerVR SGX530 | 512MB
Pandaboard | Website | ARM Cortex A9 MPCore | 2@1.2GHz | PowerVR SGX540 | 1GB
ArndaleBoard | Website | ARM Cortex A15 | 2@1.7GHz | Mali-T604 | 2GB
Wandboard | Website | Freescale i.MX6 Quad | 4@1GHz | Vivante GC 2000 | 2GB
Udoo | Website | Freescale i.MX6 Quad | 4@1GHz | Vivante GC 2000 | 1GB
HummingBoard | Website | Freescale i.MX6 Quad | 4@1GHz | Vivante GC 2000 | 2GB
pcDuino3 | Website | AllWinner A20 (ARM Cortex A7 Dual Core) | 2@1GHz | Mali-400MP2 (Dual Core) | 1GB
Banana Pi | Website | AllWinner A20 (ARM Cortex A7 Dual Core) | 2@1GHz | Mali-400MP2 (Dual Core) | 1GB
Cubieboard | Website | AllWinner A20 (ARM Cortex-A7 Dual Core) | 2@1GHz | Mali-400MP2 (Dual Core) | 2GB
OLinuXino | Website | ?? | ?? | ?? | ??
APC Rock | Website | ?? | ?? | ?? | ??
Origenboard | Website | ?? | ?? | ?? | ??
ODROID | Website | ?? | ?? | ?? | ??
CuBox-i | Website | ?? | ?? | ?? | ??


I'll keep updating this.

Friday, January 6, 2017

How to force a downgrade of the BIOS on an HP Z Workstation

Important note: I've only done this on an HP Z820, so I cannot guarantee that it's possible on other models.

I ran into a seemingly rare problem when I upgraded to the latest BIOS on my Z820 Workstation - Intel AMT stopped working. I didn't notice this at first, so I could only assume the BIOS upgrade screwed things up. My machine originally had a really old 1.x BIOS on it, and I only did one intermediate (and mandatory) BIOS update before I jumped to the very latest version. I think this huge leap between versions might have contributed to the breakage - that, along with the fact that I hadn't flashed the latest Intel AMT firmware along with the newer BIOS. To my dismay, the HP download page said that BIOS downgrades were no longer allowed, and sure enough - you really can't downgrade, not even when flashing from inside the BIOS. It was sloppy of me not to check this beforehand.

So what was I to do? I tried everything I could think of and finally discovered how to get around it, and in the end it wasn't even all that difficult - obvious, even. You see, HP Z Workstations allow you to bridge some pins on the motherboard to put the machine in emergency recovery mode (look it up in the manual) to rescue the computer from a bad BIOS flash. In emergency recovery mode you bypass the flashed BIOS entirely and use the bootblock to load a BIOS image from a USB stick - pick an older image that doesn't prevent downgrades, and it will in turn let you flash any BIOS version. The emergency recovery procedure itself is well documented. It probably didn't occur to the HP engineers that the bootblock should refuse to load older BIOSes, because that in itself would be risky: a new, badly flashed BIOS combined with such a bootblock check could prevent you from ever going back to an earlier (working) version.

The procedure is as follows:
  1. Bridge the emergency recovery pins on the motherboard (check the manual for instructions).
  2. Prepare a USB stick with an older BIOS that doesn't prevent downgrades - this is very important. Only the last couple of versions prevent downgrades, so you won't have to go too far back; the changelog will note when a BIOS version no longer allows downgrades. A rough sketch of preparing the stick follows after this list.
  3. Boot your workstation with the USB stick connected. Emergency recovery mode will start the computer seemingly as normal - except the older BIOS is now running off your USB stick rather than the one flashed to your motherboard. Pay attention during boot to see which BIOS version is loaded.
  4. Download and flash any BIOS version you like once booted into your operating system, effectively downgrading the BIOS.
  5. Unbridge the pins on the motherboard.
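
For step 2, preparing the stick looks roughly like the sketch below. The device name and BIOS file name are placeholders, and the exact file name and location the bootblock expects on the stick should be taken from HP's emergency recovery documentation for your model - this only illustrates the general idea:

  # WARNING: /dev/sdX is a placeholder - triple-check the device name first
  mkfs.vfat -F 32 /dev/sdX1        # FAT32-format the USB stick
  mount /dev/sdX1 /mnt/usb

  # Copy the BIOS image extracted from the older HP SoftPaq onto the stick.
  # The expected file name/path comes from HP's recovery documentation.
  cp bios_image.bin /mnt/usb/
  umount /mnt/usb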

As far as I know the BIOS still needs to be a signed, original HP BIOS, and I assume the bootblock verifies this. Still, I haven't tried any non-official firmware, so I cannot say for sure.

Anyway, Intel AMT started working again with the latest BIOS after I made smaller, incremental updates between versions, and somewhere along the way I managed to flash the latest Intel AMT firmware, so this little trick solved my problem. I can't remember all the BIOS versions I jumped between to get AMT up and running again, but it shouldn't be difficult to figure out - just keep downgrading to older BIOSes until you find one that makes it work again, then do the upgrades a bit more carefully, and don't forget to check Device Manager to make sure the Intel Management Engine is still working. Regardless of what HP wants you to think, you will not break the machine by going from the very latest BIOS to a much older one - older BIOSes worked just fine for me. There are other reasons you might want to downgrade the BIOS as well - some applications are certified to run on very specific BIOS versions, so this way you can take a new BIOS back to one that is certified for an application that requires it. Or maybe a newer version was just unstable? That's something I've experienced on occasion.

HP insists that "you cannot downgrade", but considering that others might run into similar problems I figured this trick was important to share. What's interesting is that there is no way for HP to actually prevent you from doing this in the future unless they're willing to also update the bootblock, and they are notoriously reluctant to do that - for good reason, because it opens up the possibility of permanently bricking a machine if a flash of said bootblock goes bad. If they did release bootblock updates with version checking, many owners would at the same time rejoice, as newer bootblocks would finally allow early-revision motherboards to support v2 Xeons - something they technically can do but are not allowed to by the older 2011 bootblock. Still, my guess is that HP will never allow bootblock updates, and that also means they can never stop you from downgrading the BIOS, regardless of what they claim.

Thursday, January 5, 2017

HP Z820 Workstation Fan Inventory

I've been considering a complete fan replacement in my HP Z820 Workstation, but there are some issues to be aware of. First of all, there are a whopping 12 fans in total, and HP uses a non-standard OEM pinout, which makes the modifier numbers especially important. The second issue is that different workstation revisions use different fans - some fan shrouds seem to be equipped with Delta fans only, while others are a mix of Delta, Nidec and AVC. Surprisingly, the model numbers used on the individual fans are poorly documented, presumably because owners are expected to buy an entire replacement fan shroud if one or more fans go bad, or a kit of front- or rear-facing fans. If the fans in the power supply go bad, you are probably meant to buy a new power supply. With an expired warranty the options become a bit more limited, so I find it far more compelling and cost-effective to buy the individual fans and replace them myself - and to do that you need to know exactly which models work.

My goal here is to - first and foremost - identify all the fans, including their positions, in my own workstation, and then all the alternatives that can work as drop-in replacements. The HP Z820 seems to be able to use the exact same fans as the newer Z840 and vice versa. A number of these fans should also work in the Z600/Z400 series and the older Z800. I cannot confirm this, however, as I don't have a Z800, so my main focus is the Z820.

Why bother with this? Delta fans, for example, have a good reputation but are unfortunately known to be noisy, so owners with those might be interested in the quieter Nidec options - especially if you do audio work and need a quieter system. Those of us with Nidec fans might be interested in replacing them with Deltas for other reasons - perhaps the better longevity, price and availability are deciding factors. I wanted to document the cooling fans to make things easier for people who want to do the replacement themselves; having the exact model information should keep your workstation within HP's specifications (CFM, RPM etc.). The fan numbering scheme I use in the images is arbitrary. I think the information about alternatives to the fans I happen to have is reasonably complete, but I have no way of knowing whether I've identified all the variants HP has used in these workstations. I don't have all the modifier numbers yet, but I will update this post when/if I do.

Update: I went with the option of replacing the entire shroud because the combined shipping costs and the rarity of some fans were making things difficult. I got a great deal on eBay on a brand new shroud for the dual-CPU option.

The Shroud



The Case



The Power Supply


FAN1, FAN2 & FAN4


  Dimensions: 60mm (Blower-style) 4-pin
  Model: Delta BUB0712HF
  Modifier number: -BE04
  HP P/N: 670051-001 REV 1

FAN3

Note: The AVC fan in my shroud doesn't seem to carry an HP product number, which suggests that the generic model might have the same pinout. I did manage to find the P/N from other sources, however.

  Dimensions: 60*60*25mm 4-pin PWM
  Model: AVC DS06025B12U
  Modifier number: P063
  HP P/N: 670050-001 Rev.A

  Alternative Delta Model
  Dimensions: 60*60*25mm 4-pin PWM
  Model: Delta AFB0612EH
  Modifier number: ??
  HP P/N: ?? (PN Required for HP OEM model)

FAN5, FAN6, FAN9 & FAN10

Note: The Nidec fans are rated at ~38 dB while the Deltas are around ~45-49 dB, and both run at 3800 RPM. Some sources suggest the Nidec fans use sleeve bearings while others say ball bearings - I'm pretty sure they're sleeve bearing. The 92mm Nidecs have an MTBF of 45,000 h, which is nearly equal to the 50,000 h you are likely to get from the Deltas. I could not find the specifications for these exact OEM fans, so these are my best guesses based on the specifications of the non-OEM equivalents of the Nidec and Delta fans. In other words, I cannot guarantee that the statements above about the bearings, MTBF and noise profile are 100% correct.

  Dimensions: 92*92*25mm 4-pin
  Model: Nidec T92T12MS3A7-57A03
  Modifier number: 2223G
  HP P/N: 647113-001 REVA (or REVB)

  Alternative Delta Model
  Dimensions: 92*92*25mm 4-pin
  Model: Delta QUR0912VH
  Modifier number: -AK59
  HP P/N: 647113-001 REV 0A

  Alternative Nidec Model (New revision that should work)
  Dimensions: 92*92*25mm 4-pin
  Model: Nidec FAN A T92T12MS3A7-57A03
  Modifier number: ??
  HP P/N: 644315-001 REVB

  Alternative Delta Model (Used in Z800 that should work)
  Dimensions: 92*92*25mm 4-pin
  Model: Delta QUR0912VH
  Modifier number: -8C2T
  HP P/N: 468763-001

FAN7 & FAN8

Note: FAN7 & FAN8 come as a pair, with wires going to the same proprietary 6-pin connector.

  Dimensions: Dual 92*92*25mm fans to one 6-pin
  Model: Nidec FAN B T92T12MS3A7-57A03
  Modifier number: 2429H F4 / 4X30H G4
  HP P/N: 644315-001 REVB

  Alternative Delta Model
  Dimensions: Dual 92*92*25mm fans to one 6-pin
  Model: Delta QUR0912VH
  Modifier number: -BL3H
  HP P/N: 644315-001 REV 0B (or 0A)

FAN11 & FAN12

Note: The power supply contains two identical Delta fans of the model below. These were found in the 1125W PSU and are likely also used in the less powerful 800W PSU. Given that Delta builds these power supplies, I doubt any other fan models are used.

  Dimensions: 80*80*25mm 4-pin
  Model: Delta QUR0812SH
  Modifier number: -HE00
  HP P/N: None

Friday, January 4, 2013

Was udev a bad idea? Not at first.

I'm not an expert when it comes to device handling on Linux, but I have written udev rules in the past (version 071 and earlier) and do have some familiarity with it. At the time udev was first introduced it looked like an excellent solution; in fact, I felt it was a welcome change from devfs. Not long ago the time came to upgrade to 105, and boy was I surprised - these days udev really looks like shit.

It sure doesn't seem like udev's design has scaled very well. For instance, udev rules you wrote even six months ago are unlikely to work with the most recent version, because the syntax keeps changing every few releases. Why would something like $modalias suddenly be changed to the much uglier $env{MODALIAS}? I'm curious to know why anyone would favor such a syntax.
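
To illustrate what I mean, here's a rough before/after written from memory - the exact keys varied from release to release and the names are made up, so treat it as an approximation rather than rules lifted from a real system:

  # Roughly how I remember writing a rule around udev 071 (approximate)
  BUS="usb", SYSFS{serial}="123456789", NAME="backup%n", RUN="/usr/local/bin/notify $modalias"

  # The same idea after the syntax changes: == for matching, ATTRS{} and $env{}
  SUBSYSTEM=="usb", ATTRS{serial}=="123456789", SYMLINK+="backup%n", RUN+="/usr/local/bin/notify $env{MODALIAS}"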

Also, udev 071 consisted of a fairly small set of binaries, but later releases grew to pretty much one helper program per device class, and you are now expected to do things like copy pre-made symlinks and device nodes manually into the /dev directory. Wasn't that supposed to be udev's job in the first place? A nice improvement is that udev 071+ completely replaces hotplug for the 2.6 kernel - which is perhaps the reason for all the helper programs, I'm not entirely sure. I still think it could have been done a lot more cleanly.

udev rules used to be something you could write in a very clean and comprehensible style, but now you'll get a headache just from looking at the basic examples. Quite frankly I'm starting to dislike it more and more and sometimes find myself wishing it would just go away. After looking at FreeBSD and how clean /dev is kept there, I'm starting to wonder if it really was such a great idea to abandon devfs altogether.

Update: The above is an old post. Since then udev has been merged into systemd - a rat's nest of an init system that I'm forced to use daily. Fortunately the eudev fork exists.

Friday, September 28, 2012

Good practices - A plea to open source projects

I compile, configure and test the software used on mainstream Linux distributions. I do this for a private distribution (for personal use) which at the moment consists of slightly more than 1000 (no, that is not a typo) build scripts, all of which I have written myself. I maintain the init scripts and all the package builds in the entire distribution software stack - everything from the kernel, glibc and the compiler toolchain all the way up to Xorg and KDE. I even wrote a scripted package manager to aid me in the task and automate the process; I haven't switched to it yet but plan to eventually. Needless to say, I know my way around Linux fairly well. Because I package so much of the Linux software stack I also know many of the dirty little secrets we hide from end users behind our carefully crafted packages. Users are often oblivious to how much trouble package maintainers go through just to get their favorite application compiled, packaged and ready to be executed.

About 90% of the software projects I encounter are well behaved, well structured and well designed. It's the remaining 10% that I spend 90% of my time on, trying to get them up and running for whatever reason. That is what this post is all about - saving time and frustration.

My beef with the current state of open source/free software is the departure from traditional UNIX software engineering principles. Software projects today seem to ignore things like the FHS (Filesystem Hierarchy Standard) and the basics of the Unix philosophy. One rule is: write simple parts connected by clean interfaces. I see this and other principles violated left and right, and blatant violations of the FHS are not uncommon. Read the FHS - trust me, it won't take long. It will help you understand the beauty of the UNIX filesystem layout; there is practically a directory assigned for everything you can think of, and once you understand it you'll appreciate how elegant it is. People with a Windows background rarely understand or appreciate why the directory layout looks the way it does on UNIX-like operating systems, but once you do, I'm sure you'll grow to love the logic behind it. At first glance it looks cryptic, but there's a reason for that - and the benefits are still there today in spite of its long history.

It isn't as complex as you might think. For example, if you've written a library then it belongs in /lib or /usr/lib (depending on what kind of library it is: a core OS library belongs in /lib, while a less crucial library belongs in /usr/lib). Does your server program need to write data somewhere? Have a look at /var. Is your application a desktop application that needs to write data somewhere? Do it in a dot directory in the user's $HOME path. Do you have static data that should be available system-wide? Stick it in /usr/share. That's what the FHS is for - it describes the filesystem and what all these directories are meant for.
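
To make it concrete, here is roughly where a hypothetical daemon called "foo" would put its files under the FHS - the name and paths are invented for the example, but the pattern is the standard one:

  /usr/bin/foo              # the executable itself
  /usr/lib/libfoo.so.1      # its supporting (non-critical) library
  /etc/foo.conf             # host-specific configuration
  /var/lib/foo/             # variable state written at runtime
  /var/log/foo.log          # log output
  /usr/share/foo/           # static, architecture-independent data
  ~/.foo/                   # per-user data for a desktop frontend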

There are brilliant programmers out there who write excellent programs, but some of them haven't taken the time to understand these basics, and thus their programs, brilliant though they might be, behave in undesirable ways. It isn't exactly fun to dig through someone's code and patch it to conform to the FHS. For the people who package software, this makes your application high-maintenance. A piece of advice to software engineers: take a long hard look at the build scripts and .spec files distributions use to build and package your program; if they are riddled with hacks just to get it packaged, then you should probably consider simplifying and correcting your build framework.

If you're using the GNU autotools framework to build your software, then make sure it respects all the standard configure switches, and more importantly make really sure that DESTDIR works as it's supposed to - you'd be surprised how many software projects neglect this particular feature, but it's essential for packaging the program. For aesthetic (and some practical) reasons: avoid CamelCase in program names, and don't use version numbers in paths - and if you really need to, opt for /usr/include/app/$VERSION rather than /usr/include/app-$VERSION. We have pkg-config for a reason: it's handy for locating package-specific directories and it simplifies managing multiple versions of a program or library on the same system, with no need to clutter the top-level /usr/include directory.
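
This is the kind of staged install we packagers rely on - a generic sketch with made-up paths, not anything project-specific - and if your build system can't do something equivalent, packaging becomes painful:

  # A packager's typical staged build: everything must land under $DESTDIR,
  # never in the live filesystem of the build machine
  ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
  make
  make DESTDIR=/tmp/pkgroot install

  # The package is then created from the staging directory
  ( cd /tmp/pkgroot && tar czf /tmp/foo-1.0-pkg.tar.gz . )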

Use relative rather than absolute symlinks in "make install". You don't want symlinks pointing to the wrong files, which is what will probably happen if your application resides on a mounted filesystem - and if it doesn't, the symlink will simply be broken. Relative symlinks also make chroot jails easier, although I wouldn't expect anyone to design their application with this in mind. In the end it's just cleaner and better to always use relative symlinks.
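
A contrived example of why this matters during a staged install (the file names are made up):

  # Inside a DESTDIR staging area:
  cd /tmp/pkgroot/usr/lib

  # Bad: the absolute link points at whatever the build host has installed
  ln -s /usr/lib/libfoo.so.1.0 libfoo.so

  # Good: the relative link stays correct wherever the tree ends up
  ln -s libfoo.so.1.0 libfoo.so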

Do NOT change the API unless you also change the major version number. Be patient, and if you desperately want to make changes to your API, wait until you are ready to release the next major version. Why? Because that's what we downstream are expecting - API changes with major version changes. Let me clarify. Let's assume your library is widely used by other projects, but you decide to push a major API change in a point release. Oblivious to the change, downstream updates to the new point release thinking it's API-compatible with the previous one. It takes a while before we realize that a dozen other applications no longer compile because of the change, and now we have to downgrade and recompile once more. I'd love to tell you that we always read ChangeLogs, but sadly some of us just don't have the time.

Do NOT use unstable, in-development libraries and core components. This causes endless misery due to constant breakage. Wait until the API and design have stabilized and please, please don't introduce dependencies on svn/git/cvs versions. Unless stable tarballs have been released, don't hook your code into that API. Gnome has always been a real PITA because they do this, and I've occasionally had to pull code from CVS just to get "just the right version" to make things work. I get it - you want the latest and greatest features - but please do not torment downstream with dependencies on unreleased code.

Digitally sign your code, ALWAYS. Make it a habit. While most people who download your code probably won't verify the signature, you'll be glad you signed it if your distribution site is ever hacked. It happens, and when it does it's one hell of a task to figure out what was tampered with.
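
Signing and verifying a release with GnuPG takes two commands - the tarball name below is just an example:

  # Create a detached, ASCII-armored signature next to the release tarball
  gpg --armor --detach-sign foo-1.0.tar.gz

  # What downstream runs to check it
  gpg --verify foo-1.0.tar.gz.asc foo-1.0.tar.gz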

On a more personal note, I'd like to encourage projects to stick to a single compression format for releases rather than producing three or four identical tarballs compressed with different algorithms that then need to be signed independently. My suggestion? Stick to gzip. It decompresses faster than anything else, and while it's very inefficient compared to LZMA2 (.xz), you only need the one tarball and just about everyone will be able to decompress it. A couple of years from now xz might be widespread enough to justify the switch.

ALWAYS include a license, preferably as a file named COPYING or LICENSE in the top-level directory. A tarball without one is like a wrapped present with a suspicious ticking noise - sure, it could be a harmless clock, but it could also be a bomb. By including a license you make your intentions clear, so we don't have to worry about getting sued for packaging and distributing your program. Use a common and widely used license - legalese isn't always easy to understand, so with a common (and well-scrutinized) license we can be sure there aren't any hidden surprises in the wording.

Describe your software and include an official URL for the project. As strange as it might seem, a lot of projects neglect these simple details. A short paragraph in a README file is more than enough, and it lets us figure out what the software is for, because it isn't always obvious - it's hard to identify the purpose of a program or library from a name like libsnx. By including an official project URL you also save us a lot of time searching for the upstream distribution site when it's time to update, and we can be sure we've got the right project - there are often plenty of alternative programs that perform a given task.

Modularity is great. It's encouraged by standard UNIX practice, but as a project grows and becomes increasingly complex, modularity starts to show its limitations. It isn't always practical to keep 100 modules in sync at the code level, especially if you start making API changes.

Off the top of my head, that's about it for now.