Ray Donnelly has been working on merging his OSX cross compiler patches into the crosstool-ng project, which should allow us to more easily build his compilers from source instead of relying on a binary blob.
Jotting down Ray's instructions in case anyone wants to try this:
On Linux it is possible now to build the equivalent of what toolchain4 provided (for OSX only - iOS still has some build issues)
Check the differences between samples/i686-apple-darwin11 and samples/i686-apple-darwin10 too, the main one being that 10 copies the sysroot into the final build whereas 11 doesn't. This means you need to pass --sysroot to the compilers built with 11 but in theory don't need to for the 10 ones (of course 10 is set up to use MacOSX10.6.sdk while 11 is set up to use MacOSX10.7.sdk).
We've been adding clang to the project most recently, which has caused us to disable llvm-gcc for now, and the addition of clang isn't ready yet (it is nearly there but I am going on holiday for 2 weeks after this week so I'm not sure if we will get it finished before then - likely not).
So, all the above instructions are still valid; go with commit 7d555f284b6977e64640a30bcec77597580d3049 if you can. Any problems give me a shout. FWIW, llvm/clang is only now coming on-line for Darwin software anyway, and vanilla gcc-4.2 (i.e. not even llvm-gcc-4.2) is much more reliable for building OSX software. Seems Apple focussed more on iOS than OSX for clang.
Let me know how it goes. We've started to submit some patches upstream to crosstool-ng this last week, but I have a feeling it could be a drawn out process. We will regularly merge the other way in the meantime though.
Mike, one question:
Is there a reason for using i686-apple-darwin11 in the current setup when compiling and not i686-apple-darwin10 given that we want to support OSX10.6 as the minVersion (and are therefore using the 10.6 SDK anyway)?
Oh, and while I am at it: Could you send me Ray's email address? I already encountered bugs while trying to compile the toolchain that seem to be fun bugs :)
Mike, are you using llvmgcc or normal gcc from toolchain4 at present? If normal gcc, then I'd recommend that you guys build from the latest of the cctools-llvm branch to get the latest fixes (llvmgcc is disabled at present, IMHO it was never very reliable, even the official Apple binaries). If you update to the latest then clang can be built now, however this is very untested so far and if you look in the TODO file in the root folder you'll get a better picture of some things that may still be problematic with clang.
Georg, there's no reason to use darwin11 instead of darwin10. When I started this work it was with a darwin11 SDK so it's purely a matter of habit for me. The merge to crosstool-ng is being done by Yann Diorcet and myself; I handle the darwin11 samples and he does the same for the darwin10 ones.
A compiler built with the darwin11 SDK (MacOSX10.7.sdk) can be used fine to build software for darwin10 using -mmacosx-version-min=10.5. However, feel free to base your version on i686-apple-darwin10 (e.g. flosoft's MacOSX10.6.sdk) and the i686-apple-darwin10 sample instead.
I recommend studying the two crosstool.config files:
samples/i686-apple-darwin10/crosstool.config and samples/i686-apple-darwin11/crosstool.config
.. these are just samples of course. You could make your own, e.g. darwin-gitian. The final compiler prefixes don't depend on the sample folder name but rather the options specified in the crosstool.config file itself.
Some options that should be noted are:
CT_DARWIN_SDK_PATH="${HOME}/MacOSX10.7.sdk"
.. this is where you placed your OSX SDK.
CT_DARWIN_COPY_SDK_TO_SYSROOT=n
.. setting this to y means that some headers and dylibs from the OSX SDK are copied into the installation prefix. Doing this means you do not need to pass --sysroot to the compiler on the command line, with the downside being that you'd not be able to distribute the built toolchain due to copyrights and the OSX SDK license.
CT_LLVM_V_3_3=y
.. specifies the version of LLVM (and clang) to use. LLVM is compiled into cctools for LTO support and also, obviously for clang. Toolchain4 used LLVM 2.7 which is quite old at this stage. 3.3 is the latest release. I'm more interested in modernising things than using old versions (though reliability trumps all other concerns).
CT_DEBUGGABLE_TOOLCHAIN=n
.. expect to use 20GB if you set this to y. The built tools (and also those installed) are built at -O0 -ggdb and left un-stripped.
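The interaction between CT_DARWIN_COPY_SDK_TO_SYSROOT and --sysroot can be sketched in shell as below. This is only an illustration: `sysroot_flag` is a hypothetical helper, and the compiler name `i686-apple-darwin11-gcc` is an assumption based on the sample name, not something the config guarantees. The script only prints the command line it would use; it does not invoke a compiler.

```shell
# Values as they might appear in crosstool.config (taken from the options above).
CT_DARWIN_SDK_PATH="${HOME}/MacOSX10.7.sdk"
CT_DARWIN_COPY_SDK_TO_SYSROOT=n

# Hypothetical helper: print the --sysroot flag only when the SDK was NOT
# copied into the toolchain's own sysroot at build time.
#   $1 = value of CT_DARWIN_COPY_SDK_TO_SYSROOT (y or n)
#   $2 = path to the OSX SDK
sysroot_flag() {
    if [ "$1" = "y" ]; then
        :   # SDK headers/dylibs already live in the install prefix
    else
        printf '%s\n' "--sysroot=$2"
    fi
}

# Assumed compiler name; adjust to whatever prefix your config produces.
echo "i686-apple-darwin11-gcc $(sysroot_flag "$CT_DARWIN_COPY_SDK_TO_SYSROOT" "$CT_DARWIN_SDK_PATH") -c hello.c"
```

With the settings shown, the printed command includes --sysroot pointing at the SDK; with CT_DARWIN_COPY_SDK_TO_SYSROOT=y the flag is omitted.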
I think it would be sensible for me to setup exactly the same environment you are using. Is this Ubuntu? If so, can you point me to the exact ISO and any scripts you use to prepare it for development? I will then make a new VM.
If you've got some logs (build.log) detailing the "fun bugs" please post them.
Something else I should mention is that the old GCC in the crosstool-ng port doesn't yet support the -arch and -Xarch flags (whereas toolchain4 does).
This is because those flags are actually handled by driverdriver.c which is an Apple-specific program that's compiled outside of the normal GCC/llvmgcc compilation and when run calls the real GCC once for each arch specified (using the standard -m32 or -m64 flag). It then calls lipo to merge the objects for each arch together into a fat object/binary.
I'm not sure how the tbb build system for Mac invokes gcc; I would hope it is easy to change it from using "-arch i386" to "-m32" and/or "-arch x86_64" to "-m64". Of course, if you build fat binaries in a single invocation then this is a bigger problem.
Having said that, adding the driverdriver program is high priority and something I expect to do tomorrow.
Finally, clang 'natively' supports these flags but I'd be (pleasantly) surprised if switching tbb to build with clang was trivial.
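Until driverdriver is available, the "-arch" to "-m32"/"-m64" substitution described above could be done in a small wrapper. The following is a minimal sketch, not part of crosstool-ng: `translate_arch_flags` is a hypothetical name, it handles only the single-arch case (no lipo step, so no fat binaries), and it does not preserve arguments containing whitespace.

```shell
# Hypothetical helper: rewrite Apple-style "-arch <name>" into the standard
# -m32 / -m64 flags and print the resulting argument list on one line.
# Fails if more than one -arch is given, since merging per-arch objects
# with lipo (what driverdriver does) is out of scope for this sketch.
translate_arch_flags() {
    out=""
    arch_count=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -arch)
                arch_count=$((arch_count + 1))
                case "$2" in
                    i386)   out="$out -m32" ;;
                    x86_64) out="$out -m64" ;;
                    *) echo "unsupported or missing arch: $2" >&2; return 1 ;;
                esac
                shift 2
                ;;
            *)
                out="$out $1"
                shift
                ;;
        esac
    done
    if [ "$arch_count" -gt 1 ]; then
        echo "multiple -arch flags need driverdriver/lipo" >&2
        return 1
    fi
    # Strip the leading space added by the first append.
    printf '%s\n' "${out# }"
}

translate_arch_flags -arch i386 -O2 -c foo.c
```

The last line prints the translated flags for a 32-bit compile; a real wrapper would pass them on to the cross gcc instead of printing them.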
Mike, are you using llvmgcc or normal gcc from toolchain4 at present? If normal gcc, then I'd recommend that you guys build from the latest of the cctools-llvm branch to get the latest fixes (llvmgcc is disabled at present, IMHO it was never very reliable, even the official Apple binaries).
Normal gcc and yes, I am using the latest cctools-llvm branch.
Georg, there's no reason to use darwin11 instead of darwin10. When I started this work it was with a darwin11 SDK so it's purely a matter of habit for me. The merge to crosstool-ng is being done by Yann Diorcet and myself; I handle the darwin11 samples and he does the same for the darwin10 ones.
A compiler built with the darwin11 SDK (MacOSX10.7.sdk) can be used fine to build software for darwin10 using -mmacosx-version-min=10.5, However feel free to base your version on i686-apple-darwin10 (e.g. flosoft's MacOSX10.6.sdk) and the i686-apple-darwin10 sample instead.
That was actually the reason why I asked. We use already flosoft's MacOSX10.6.sdk and I hoped to avoid using the 10.7 SDK to get a compiler to use that one with the 10.6 SDK... Good.
I recommend studying the two crosstool.config files:
samples/i686-apple-darwin10/crosstool.config and samples/i686-apple-darwin11/crosstool.config
.. these are just samples of course. You could make your own, e.g. darwin-gitian. The final compiler prefixes don't depend on the sample folder name but rather the options specified in the crosstool.config file itself.
Thanks I'll think about doing an own while I digest that whole system.
I think it would be sensible for me to setup exactly the same environment you are using. Is this Ubuntu? If so, can you point me to the exact ISO and any scripts you use to prepare it for development? I will then make a new VM.
The gitian build system is working with (at least) the current LTS (12.04). In order to build the Mac TBB you do the following
again until you have all the necessary sources (i.e. until |make prep| does not give you any errors anymore)
Then do
./mkbundle-mac.sh
and it should give you (after a while) Mac TBBs in a new directory (currently 3.0-alpha-4). Anyway, if you have issues with setting it up. Just write me a mail and we'll sort these things out.
If you've got some logs (build.log) detailing the "fun bugs" please post them.
So I have found so far three issues (disclaimer: I am not sure yet if there is anything you can do about as this might actually be things belonging to the source packages used in building the compiler):
I hit intermittent GMP configure failures due to newly created files being older than distributed ones. (log: cross_mac_gmp1) That is likely due to the gitian build environment but it would be good to somehow avoid that. I am wondering whether the reason here is some old version of GMP? I have been building mingw-w64 from source for a while with a similar gitian environment but never hit this problem.
'groff' was missing (the check in the configure script is there but although 'groff' is needed for building the script is not returning with an error if it is not available) (cross_mac_build1). Installing the necessary package probably solves the issue.
a Makefile.config(?) is missing? (cross_mac_build2).
Now, what bothers me most, though, is that Firefox 24 is not working with a gcc4.2 anymore (at least Mozilla is saying that in the configure script and I don't have a reason to doubt that). I am under the impression that your cross-compilers are (by default) gcc4.2, no? If so, what are our options here? Are there non-interactive ways to change the gcc version?
The gitian build system is working with (at least) the current LTS (12.04). In order to build the Mac TBB you do the following
{{{
git clone https://git.torproject.org/builders/tor-browser-bundle.git
cd tor-browser-bundle/gitian
make prep
}}}
Install all the stuff you are asked to and run
{{{
make prep
}}}
again until you have all the necessary sources (i.e. until |make prep| does not give you any errors anymore)
Then do
{{{
./mkbundle-mac.sh
}}}
and it should give you (after a while) Mac TBBs in a new directory (currently 3.0-alpha-4). Anyway, if you have issues with setting it up. Just write me a mail and we'll sort these things out.
I forgot to add that the relevant files for cross-compiling the TBB code are gitian-tor.yml and gitian-firefox.yml in tor-browser-bundle/gitian/descriptors/mac. This should give you a hint on how your cross-compilers are currently used in the gitian build setup. If you have questions about that as well, drop me a mail.
Mike, are you using llvmgcc or normal gcc from toolchain4 at present? If normal gcc, then I'd recommend that you guys build from the latest of the cctools-llvm branch to get the latest fixes (llvmgcc is disabled at present, IMHO it was never very reliable, even the official Apple binaries).
Normal gcc and yes, I am using the latest cctools-llvm branch.
Georg, there's no reason to use darwin11 instead of darwin10. When I started this work it was with a darwin11 SDK so it's purely a matter of habit for me. The merge to crosstool-ng is being done by Yann Diorcet and myself; I handle the darwin11 samples and he does the same for the darwin10 ones.
A compiler built with the darwin11 SDK (MacOSX10.7.sdk) can be used fine to build software for darwin10 using -mmacosx-version-min=10.5, However feel free to base your version on i686-apple-darwin10 (e.g. flosoft's MacOSX10.6.sdk) and the i686-apple-darwin10 sample instead.
That was actually the reason why I asked. We use already flosoft's MacOSX10.6.sdk and I hoped to avoid using the 10.7 SDK to get a compiler to use that one with the 10.6 SDK... Good.
I recommend studying the two crosstool.config files:
samples/i686-apple-darwin10/crosstool.config and samples/i686-apple-darwin11/crosstool.config
.. these are just samples of course. You could make your own, e.g. darwin-gitian. The final compiler prefixes don't depend on the sample folder name but rather the options specified in the crosstool.config file itself.
Thanks I'll think about doing an own while I digest that whole system.
I think it would be sensible for me to setup exactly the same environment you are using. Is this Ubuntu? If so, can you point me to the exact ISO and any scripts you use to prepare it for development? I will then make a new VM.
The gitian build system is working with (at least) the current LTS (12.04). In order to build the Mac TBB you do the following
Should I use the 32bit version of the LTS?
{{{
git clone https://git.torproject.org/builders/tor-browser-bundle.git
cd tor-browser-bundle/gitian
make prep
}}}
Install all the stuff you are asked to and run
{{{
make prep
}}}
again until you have all the necessary sources (i.e. until |make prep| does not give you any errors anymore)
Then do
{{{
./mkbundle-mac.sh
}}}
and it should give you (after a while) Mac TBBs in a new directory (currently 3.0-alpha-4). Anyway, if you have issues with setting it up. Just write me a mail and we'll sort these things out.
If you've got some logs (build.log) detailing the "fun bugs" please post them.
So I have found so far three issues (disclaimer: I am not sure yet if there is anything you can do about as this might actually be things belonging to the source packages used in building the compiler):
I hit intermittent GMP configure failures due to newly created files being older than distributed ones. (log: cross_mac_gmp1) That is likely due to the gitian build environment but it would be good to somehow avoid that. I am wondering whether the reason here is some old version of GMP? I have been building mingw-w64 from source for a while with a similar gitian environment but never hit this problem.
In general, how does gitian handle filestamp dependencies? I'm under the impression that it uses .so injection to force specific time functions to return the same thing, given that, I'm a little confused. Is there a mechanism to force some fixed negative delta onto files created by specific tools (tar and patch come to mind)?
'groff' was missing (the check in the configure script is there but although 'groff' is needed for building the script is not returning with an error if it is not available) (cross_mac_build1). Installing the necessary package probably solves the issue.
Ok this sounds like an upstream crosstool-ng problem. I will put it on the TODO list for now.
a Makefile.config(?) is missing? (cross_mac_build2).
Now, what bothers me most, though, is that Firefox 24 is not working with a gcc4.2 anymore (at least Mozilla is saying that in the configure script and I don't have a reason to doubt that). I am under the impression that your cross-compilers are (by default) gcc4.2, no? If so, what are our options here? Are there non-interactive ways to change the gcc version?
4.2.1 was the last GCC that Apple provided. They heavily patched this version (and made llvmgcc from it) and they never updated beyond that (understandable given the level and nature of patching; of course they could've opted to make their changes cleanly and worked with GCC core developers to upstream them, but I suspect GCC was something they worked on grudgingly). They switched from GCC to clang, so when you say that Firefox 24 is not working with gcc4.2 anymore, I can only assume that for Mac they switched to building with clang - or maybe llvmgcc. If you can determine the exact details of this it would be helpful. Also, can you find out whether it requires libc++? When do you plan to switch over to Firefox 24?
I will download your git repo, give it a test and study your logs at lunchtime. You didn't say whether you use the -arch flag or not, but no problem I can take a look.
Just for reference: Even though the build (i686-apple-darwin10) is much smoother outside the gitian setup it still does not go through on a clean Ubuntu VM, see output in cross_mac_vm.
I added two logs, the first is actually targeting ARM (iPhone), so you probably should ignore that one for now.
Was your log complete? In mine, "[INFO ] Installing cctools for host" doesn't happen until line 42686 whereas it's the first line in your log. I'm guessing you snipped your log to avoid size limits and so that it can be referred to via URLs? I think having your complete log would be helpful if this is the case.
From your log:
"[CFG ] configure: WARNING: llvm-config (/home/firefox/x-tools/i686-apple-darwin10/bin/i686-apple-darwin10-llvm-config) not found, will use LLVM 2.7 defaults" is interesting. It indicates that LLVM wasn't built or installed correctly.
Attached is the full build log (cross_mac_vm_full.tar.bz2).
That does not matter as gitian is creating own VMs for building.
[snip]
In general, how does gitian handle filestamp dependencies? I'm under the impression that it uses .so injection to force specific time functions to return the same thing, given that, I'm a little confused. Is there a mechanism to force some fixed negative delta onto files created by specific tools (tar and patch come to mind)?
Mike, could you comment on that. I am not sure about the details...
Now, what bothers me most, though, is that Firefox 24 is not working with a gcc4.2 anymore (at least Mozilla is saying that in the configure script and I don't have a reason to doubt that). I am under the impression that your cross-compilers are (by default) gcc4.2, no? If so, what are our options here? Are there non-interactive ways to change the gcc version?
4.2.1 was the last GCC that Apple provided. They heavily patched this version (and made llvmgcc from it) and they never updated beyond that (understandable given the level and nature of patching; of course they could've opted to make their changes cleanly and worked with GCC core developers to upstream them, but I suspect GCC was something they worked on grudgingly). They switched from GCC to clang, so when you say that Firefox 24 is not working with gcc4.2 anymore, I can only assume that for Mac they switched to building with clang
or maybe llvmgcc. If you can determine the exact details of this it would be helpful. Also, can you find out whether it requires libc++?
Will see what I can find.
When do you plan to switch over to Firefox 24?
As soon as possible I think. TBB with Firefox 24 should be ready if the ESR 17 reaches its EOL (which is IIRC in about 7-8 weeks).
I will download your git repo, give it a test and study your logs at lunchtime. You didn't say whether you use the -arch flag or not, but no problem I can take a look.
I've had an initial look and here's what I've gleaned:
There's currently an issue which forces us to need to clean out both the builddir and any existing installation before doing subsequent builds in the same directories. For you this would mean doing:
You're currently basing your build on the darwin10 sample, but actually you'd be better off starting from my darwin11 one (see note [1]) so I'd replace samples/i686-apple-darwin10/crosstool.config with samples/i686-apple-darwin11/crosstool.config and then, to turn it back into a darwin10 config, change the following entries:
CT_DARWIN_MAC_OSX_V_10_7=y
to:
CT_DARWIN_MAC_OSX_V_10_6=y
CT_DARWIN_VERSION="11"
to:
CT_DARWIN_VERSION="10"
CT_DARWIN_SDK_PATH="${HOME}/MacOSX10.7.sdk"
to:
CT_DARWIN_SDK_PATH="${HOME}/MacOSX10.6.sdk"
.. if you want to be able to use the generated toolchains without passing --sysroot $MacOSXSDKDir then (I don't recommend this - see note [2]) change
CT_DARWIN_COPY_SDK_TO_SYSROOT=n
to:
CT_DARWIN_COPY_SDK_TO_SYSROOT=y
.. depending on whether you would want to debug issues if/when we run into them change (likely worthwhile for now, provided you've got ~20GB to spare)
CT_DEBUGGABLE_TOOLCHAIN=n
to:
CT_DEBUGGABLE_TOOLCHAIN=y
Notes:
[1] I suspect I maintain my darwin11 samples a bit more regularly, whereas Yann generates new ones each time via make menuconfig. Also my samples will always use the latest versions of things that are in a reasonably working state (LLVM and clang 3.3 in this case)
[2] I don't recommend this as it's not a use-case I have verified as working, specifically, AFAIK the frameworks are not copied over so building anything that requires more components than a C and C++ library is likely to fail. Further, toolchains built like this cannot be legally distributed.
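The three mandatory substitutions above (OSX version, Darwin version, SDK path) can also be scripted. The sketch below applies them with sed to a cut-down stand-in for samples/i686-apple-darwin11/crosstool.config written to a temporary file; `darwin11_to_darwin10` is a hypothetical helper name. A real config may contain additional related lines (for example "is not set" comments) that this does not touch, so treat it as an illustration rather than a complete conversion.

```shell
# Stand-in for the darwin11 sample config, containing only the entries
# discussed above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CT_DARWIN_MAC_OSX_V_10_7=y
CT_DARWIN_VERSION="11"
CT_DARWIN_SDK_PATH="${HOME}/MacOSX10.7.sdk"
CT_DARWIN_COPY_SDK_TO_SYSROOT=n
CT_DEBUGGABLE_TOOLCHAIN=n
EOF

# Hypothetical helper: print a darwin10 variant of the config file in $1.
# Only the three entries listed in the comment above are rewritten; the
# optional COPY_SDK_TO_SYSROOT and DEBUGGABLE_TOOLCHAIN switches are left alone.
darwin11_to_darwin10() {
    sed -e 's/^CT_DARWIN_MAC_OSX_V_10_7=y$/CT_DARWIN_MAC_OSX_V_10_6=y/' \
        -e 's/^CT_DARWIN_VERSION="11"$/CT_DARWIN_VERSION="10"/' \
        -e 's|^\(CT_DARWIN_SDK_PATH=.*\)MacOSX10\.7\.sdk"$|\1MacOSX10.6.sdk"|' \
        "$1"
}

converted=$(darwin11_to_darwin10 "$cfg")
printf '%s\n' "$converted"
rm -f "$cfg"
```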
You're currently basing your build on the darwin10 sample, but actually you'd be better off starting from my darwin11 one (see note [1]) so I'd replace samples/i686-apple-darwin10/crosstool.config with samples/i686-apple-darwin11/crosstool.config and then, to turn it back into a darwin10 config, change the following entries:
I did that but am running in the same error as in cross_mac_build2 (the Makefile.config issue mentioned above). I attached the whole log (cross_mac_build3.tar.bz2) in case it helps.
Are you running it via gitian now? The reason I ask is that the unpacking stage of the log is full of:
[FILE ] gmp-5.1.1/gen-fac.c
[FILE ] tar: gen-fac.c: time stamp 2013-02-11 15:29:14 is 413911753.757075149 s in the future
.. clearly 2013-02-11 is in the past (given a sane, non-US date ordering at least)
Then the LLVM build itself has lots of:
[ALL ] make[4]: Warning: File `/home/ubuntu/build/crosstool-ng/ct-ng-final/.build/i686-apple-darwin10/build/build-LLVM-host-i686-build_pc-linux-gnu/Makefile.llvmbuild' has modification time 0.81 s in the future
[ALL ] llvm[4]: Regenerating /home/ubuntu/build/crosstool-ng/ct-ng-final/.build/i686-apple-darwin10/build/build-LLVM-host-i686-build_pc-linux-gnu/Makefile.config
.. so it's constantly re-generating the Makefiles and doing other "maintainer mode" build steps that shouldn't need to be done with a released tarball.
Yes. This bug is about getting everything running in gitian which is currently my main concern (or better my main concern is getting Fx 24 ESR cross-compiled for Mac which is actually #9829 (closed)). I am happy to test with a clean VM without gitian, though, (to debug other/related issues) but that needs to be done in a different bug (not sure where crosstools-ng has its bugtracker).
I agree that gitian is the end result here, however having crosstool-ng working outside of gitian is a necessary first step toward that result. If you feel this step merits a ticket of its own then that's fine, please go ahead and create one.
I am happy if the bug report lives on this tracker, but https://github.com/diorcety/crosstool-ng/issues would arguably be a better place for it. I can try to create one there if you prefer?
As soon as my build machine is ready again I'll try your suggestions in a clean VM outside of gitian and report an issue in the crosstool-ng bugtracker (if there are any, that is).
.. and entered "now - 413911773 seconds" and it said:
"Monday, 28-Aug-2000"
gnutar will extract files with the date stamp as embedded in the archives; having these dates 13 years in the future is likely to cause the build failure you are seeing. I think you need to fix your date at "the-date-of-the-most-recent-tarball-that-you-use + 1 second" or something like that. Mike's input on this would be very useful.
Of course if you've got solutions to any of these issues that require options being added to crosstool-ng then I can try to make those changes and get them merged when everything is working.
That said, even if there is a lot of regenerating going on I currently don't see why Makefile.config should be missing because of this. Thus, I am not sure whether dealing with the faketime issue is sufficient to solve this particular problem.
Gitian relies on libfaketime to set the clock to a fixed value to deal with embedded timestamps in archives and in the build process. See: https://blog.torproject.org/blog/deterministic-builds-part-two-technical-details and see: https://github.com/wolfcw/libfaketime/blob/master/README for how libfaketime works (basically via LD_PRELOAD; and we are setting FAKETIME to "2000-01-01 00:00:00")
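As a sanity check, the skew tar reported for gmp-5.1.1/gen-fac.c is consistent with that FAKETIME value. The sketch below subtracts the reported skew (rounded down to whole seconds) from the file's time stamp. It assumes GNU date and that tar printed the time stamp in UTC; both are assumptions about the build VM, not something the log states.

```shell
# Values copied from the tar warning quoted earlier in this ticket.
tar_mtime='2013-02-11 15:29:14'   # time stamp stored in the tarball
skew_seconds=413911753            # "is ... s in the future", fraction dropped

# Convert the time stamp to epoch seconds (interpreted as UTC), subtract the
# skew, and print the clock value the build must have been seeing.
mtime_epoch=$(date -u -d "$tar_mtime" +%s)
build_clock=$(date -u -d "@$((mtime_epoch - skew_seconds))" '+%Y-%m-%d %H:%M:%S')
echo "$build_clock"
```

Under those assumptions this prints a time within a second of 2000-01-01 00:00:00, i.e. the faked clock rather than the real date.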
Forgot to add that we do
{{{
find -type f | xargs touch --date="$REFERENCE_DATETIME"
}}}
in source directories where "$REFERENCE_DATETIME" is currently "2000-01-01 00:00:00". That might be a thing worth trying regarding the crosstools-ng sources...
That said, even if there is a lot of regenerating going on I currently don't see why Makefile.config should be missing because of this. Thus, I am not sure whether dealing with the faketime issue is sufficient to solve this particular problem.
There's continual regeneration of Makefile.config in the presence of recursive make (-j5). So I can easily imagine that at various points Makefile.config does not exist. Removing and re-writing a file is not a multithread-safe atomic operation.
I added an option for this feature:
From menuconfig's perspective it is called "Date and time to set on all source files"
From config file perspective it's called CT_SRC_REFERENCE_DATETIME.
I also added it to samples/i686-apple-darwin11/crosstool.config as: CT_SRC_REFERENCE_DATETIME="1999-12-31 23:59:59"
It would be good if you could pull the latest, try this out and see where things stand.
Okay. The build failed. See cross_mac_build4.tar.bzip for the full log. Gitian gave me (additionally it seems) the following messages:
[00:00] / [INFO ] Forcing date and time of patched sources to 1999-12-31 23:59:59
[00:00] / touch: cannot touch `space/test.h': Not a directory
touch: cannot touch `space/test.h.result': Not a directory
touch: cannot touch `space/test2.m.in': Not a directory
touch: cannot touch `space/test1.m.in.result': Not a directory
touch: cannot touch `space/test1.m.in': Not a directory
touch: cannot touch `space/test2.m.in.result': Not a directory
[ERROR]
[00:00] / [ERROR] >>
[00:00] / [ERROR] >> Build failed in step 'Forcing date and time of patched sources to 1999-12-31 23:59:59'
[00:00] / [ERROR] >> called in step 'Extracting and patching toolchain components'
[00:00] / [ERROR] >> called in step '(top-level)'
[00:00] / [ERROR] >>
[00:00] / [ERROR] >> Error happened in: main[scripts/crosstool-NG.sh@651]
Ah, I had locally disabled clang from my config, and the "find | xargs touch" command doesn't correctly handle spaces in paths (which one of the clang test-suite files contains).
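One common way to make a "find | xargs touch" pipeline safe for paths containing spaces is to pass NUL-separated names. The sketch below demonstrates this on a throwaway directory; it is not necessarily the fix that went into crosstool-ng. It assumes GNU find, xargs, touch and stat, and pins TZ to UTC so the resulting time stamp is predictable. The directory and file names are made up for the demonstration.

```shell
export TZ=UTC
REFERENCE_DATETIME="1999-12-31 23:59:59"

# Throwaway source tree with a space in one directory name.
src=$(mktemp -d)
mkdir -p "$src/dir with space"
: > "$src/dir with space/test.h"

# -print0 / -0 separate file names with NUL bytes, so spaces (and newlines)
# inside names are not treated as separators. -r skips touch on empty input.
find "$src" -type f -print0 | xargs -0 -r touch --date="$REFERENCE_DATETIME"

# Read back the modification time as epoch seconds to confirm it was set.
mtime=$(stat -c %Y "$src/dir with space/test.h")
echo "$mtime"
rm -rf "$src"
```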
I'm now giving gitian a go. First thing I noticed is that:
sudo torsocks apt-get install
.. is very noisy:
07:54:57 libtorsocks(933): The symbol res_query() was not found in any shared library. The error reported was: not found!
07:54:57 libtorsocks(933): The symbol res_search() was not found in any shared library. The error reported was: not found!
07:54:57 libtorsocks(933): The symbol __res_send() was not found in any shared library. The error reported was: not found!
07:54:57 libtorsocks(933): The symbol res_querydomain() was not found in any shared library. The error reported was: not found!
.. should I be concerned about this?
Also, you've been trying to build crosstool-ng under gitian, is that just a question of prepending "sudo torsocks" in front of all the crosstool-ng build commands? If you have a new script that you can commit then that would be good.
I'm now giving gitian a go. First thing I noticed is that:
sudo torsocks apt-get install
.. is very noisy:
07:54:57 libtorsocks(933): The symbol res_query() was not found in any shared library. The error reported was: not found!
07:54:57 libtorsocks(933): The symbol res_search() was not found in any shared library. The error reported was: not found!
07:54:57 libtorsocks(933): The symbol __res_send() was not found in any shared library. The error reported was: not found!
07:54:57 libtorsocks(933): The symbol res_querydomain() was not found in any shared library. The error reported was: not found!
.. should I be concerned about this?
No.
Also, you've been trying to build crosstool-ng under gitian, is that just a question of prepending "sudo torsocks" in front of all the crosstool-ng build commands?
No. 'torsocks' is just used to download the dependencies via Tor.
If you have a new script that you can commit then that would be good.
Let me know when you are ready (i.e. the VMs are up and you built a Mac bundle successfully). I'll send you a patch which you can apply on top of your local branch and which should give you the environment I am currently using.
That said, I've now run into three different issues which are likely due to the gitian build process: 1) the linker got killed 2) not enough space on the VM 3) another build failure for yet unknown reasons. I fixed 1) and 2) and am currently debugging 3).
But your latest two commits fixing bashisms seemed to work. I'll test that on my Ubuntu VM again as soon as my build machine is not occupied with gitian build stuff and will comment on the github bugtracker then.
It was last night when I last tried but I remember getting something about needing to do:
export USE_LXC=1
.. because AFAIR it said VM-in-a-VM wouldn't work so to do this instead? Is that the case? Are there some settings I can apply to make VM-in-a-VM work?
Good questions. But, honestly, I don't know the answers to them. I never tried that setup as building without nested VMs is already slow enough for me :)
Trying with a native Ubuntu 12.04.3 now, I'm getting:
Formatting 'target-lucid-i386.qcow2', fmt=qcow2 size=11811160064 backing_file='base-lucid-i386.qcow2' encryption=off cluster_size=65536
ssh_exchange_identification: read: Connection reset by peer
ssh: connect to host localhost port 2223: Connection refused
ssh: connect to host localhost port 2223: Connection refused
Lucid i386 VM build failed... Trying again
I will let it try again then I'll try "make vmclean && make build".
Let me know when you are ready (i.e. the VMs are up and you built a Mac bundle successfully). I'll send you a patch which you can apply on top of your local branch and which should give you the environment I am currently using.
Although I've not been able to make a build yet I think having your patch to hand would be good.
The build is still failing although after 4 hours of compiling I feel I am close. Seems to be some files are missing (although the first error is probably not the culprit here) (see cross_mac_gcc_errors.tar.bz2).
Some things to note:
I still hit one error of type 3) in comment 4. Thus, your FAKETIME fix probably needs to be tuned (somehow) a bit more.
I got another presumably related thing tonight: all 4 CPUs run with 100% for hours but nothing got compiled (I had to abort this one)
Not sure if
CT_DEBUGGABLE_TOOLCHAIN=n
is really working for me as I am already running my builds with it but I had to bump the size of the VM already twice. I am now at 32GB and the latest build was already at 26GB when it was failing.
4) The gitian build log gives me hard to read output of the whole compiling process :)
That happens only when using crosstools-ng (and the log output of the actual compiling process looks similar). Not sure if there is anything you can do about, though. But it would be nice if this could be fixed.
That said, I'll attach a patch set with instructions on how to use it with your local gitian environment later today.
I imagine I can turn off the /-\ animation for you easily enough; investigating ..
Please do attach the patch. Also, I emailed you & Mike 3 patches I had to apply to get the toolchain4-based tbb-mac to build. If you can consider them I'd appreciate it.
If you add:
CT_LOG_PROGRESS_BAR=n
.. to your crosstool.config file then the logs should be cleaner.
Thanks, I'll try that the next time.
Attached is the patch. It is against fc14747c24c87085f6ca502d1643dd2e7a0ea7a0. Additional things you need to do:
put your crosstool.config file into gitian-builder/inputs
run fetch-inputs.sh again
in gitian-builder/inputs/crosstool-ng you need to check out the revision you want and create a tag (this is currently 'test5', which is mirrored in gitian/versions)
You need to raise the max_size of the precise VM. I am currently trying 32GB (32768 as the value of --rootsize). Maybe this is enough (I have debug output in the crosstool.config disabled). You do that with gitian-builder/bin/make-base-vm and the --rootsize parameter in line 76.
Now, remove both precise qcow2 files in gitian-builder
Remove the tor-browser-mac32-gbuilt.zip in gitian-builder/inputs
sudo dpkg -i apple-uni-sdk-10.6_20110407-0.flosoft1_i386.deb
sudo: no tty present and no askpass program specified
Sorry, try again.
(I tried entering my login password but I'm guessing there's some other password that we need.. does the gitian VM not just do everything as root anyway?)
Ok I managed to work around this, but the things I had to do were horrible:
I decided that I needed my (inside the VM) user to have sudoer access without passwords because the actual build commands are run with redirection for logging, so they can't use the tty to ask for passwords anyway.
In order to be able to edit the /etc/sudoers in the VM I had to:
Patch gitian-builder with 0002-use-ssh-t-for-commands-run-on-target-via-sudo.patch (attached; please read the comments in it, as it is likely to break some other things. Basically, if sudo appears in $* then I add -t to the ssh command).
Then with this done, run on-target sudo nano /etc/sudoers to add "Defaults:ubuntu !authenticate" to it.
However, instead of trying to install the flosoft MacOSX SDK .deb file at build time, I think the following options are better:
Install the .deb at target VM creation time (as I'm guessing other packages are installed?)
Also, I've seen various references to /usr/lib/i686-apple-darwin*; I think these are leftovers from before Mike switched to using toolchain4. These toolchains are fully relocatable so none of this should be needed.
Well, ignore all of that, somehow I'd messed up my sshd config locally so while I thought I was connecting to the VM, I was actually connecting to my host machine!
Good news! Building at -j1 allowed me to complete a gitian build of crosstool-ng.
To force this, in crosstool.config you need to add:
CT_PARALLEL_JOBS=1
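A minimal sketch of applying such an option non-interactively, for anyone scripting this. The helper name set_ct_option and the temporary stand-in config file are illustrative assumptions, not part of crosstool-ng or gitian; it simply drops any existing assignment of the key and appends the new one:

```shell
#!/bin/sh
# set_ct_option is a hypothetical helper, not part of crosstool-ng or gitian.
# Usage: set_ct_option FILE KEY VALUE
set_ct_option() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp)
    # Drop any existing assignment of KEY (and a kconfig-style
    # "# KEY is not set" line); grep exits non-zero if nothing is left.
    grep -v -e "^$key=" -e "^# $key is not set\$" "$file" > "$tmp" || true
    # Append the new assignment and replace the original file.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Demonstration on a throwaway file standing in for crosstool.config.
config=$(mktemp)
printf 'CT_PARALLEL_JOBS=4\nCT_LOG_PROGRESS_BAR=y\n' > "$config"

set_ct_option "$config" CT_PARALLEL_JOBS 1
set_ct_option "$config" CT_LOG_PROGRESS_BAR n

cat "$config"
```

Pointing the helper at the real crosstool.config in gitian-builder/inputs instead of the throwaway file would be the intended use; the same call also covers CT_LOG_PROGRESS_BAR.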
It seems a shame to forgo parallelism since everything built ok (to the best of my knowledge at least) up to the last thing, which is actually the legacy GCC 4.2.1, not clang. I will compare build artefacts to try to isolate what exactly causes this. FWIW, gnumake outputs warnings:
make[5]: warning: Clock skew detected. Your build may be incomplete.
.. forcing a specific date then generating tools (gengtype) that generate build artefacts that (Makefile-wise) depend on the build tool that created them is in the realm of unspecified behaviour I expect.
Could you check the bits relating to libfaketimeMT.so.1? It seems to me that this may be safer to use, but given our use pattern (basically always return a constant value), it is likely a red herring. Also, I don't think gnumake is multithreaded (sure, pthreads and fork both use sys_clone at the lowest level, but when forking no MT protection would be required; I'm not sure why gnumake links to pthreads, to be honest). It's just a thought and I will do a test build later to see if it makes any difference.
I've also started to examine the behaviour of gnumake more closely as I've noticed that during the build, a file that should be built only once is getting built 8 times (and overall, the log from a gitian build is about twice the length of one from a non-gitian build). From my initial prodding, it seems that somehow make has an un-faked timestamp for things it has just created, but this could be down to a problem with my test-case setup, and that wouldn't cause the 8 times rebuild either.
I've concluded that libfaketime is somewhat broken for the use case that gitian build requires.
Some problems I've seen include:
The fake time values are not setup correctly for early calls (at least in the MT version any that happen between semaphore creation and the getenv("FAKETIME")/getenv("FAKETIME_FMT") calls).
Caching functionality is broken; it doesn't respect the "FAKETIME" env. var, merrily replacing it with "+0" instead as soon as it decides that the cached values are too old. This means that long running processes stop faking time properly early in their execution.
nano-second values are never touched for the stat family of functions - or likely any that call fake_time - and gnumake uses nanoseconds for dependency checking. Without fixing this, the dependency timestamp check could be replaced with "rand()&1" (Mike's blog post mentions this in point 4 of "Remaining Build Reproducibility Issues [Millisecond and below timestamps are not fixed by libfaketime]")
I've got a messy patch for these issues that I can share?
Also, libfaketime doesn't actually affect the real timestamps written to any files; it only intercepts attempts to query these values.
I've got a messy patch for these issues that I can share?
Sure. Best would probably be to file these issues upstream (or check whether they are already there). I can do both. Now, what are our options here? Accept building with
CT_PARALLEL_JOBS=1
until these issues are fixed? Trying to get the clang compiler working (which we probably need to do anyway, see #9829 (closed)) and hoping we get away with that?
Yes, for now I recommend sticking with -j1. In fact, even though I think these are 'good' fixes within the remit of libfaketime, I suspect that they may turn occasional failure in gitian into deterministic failure!
.. this is just speculation at this point (it could actually make things work with -jN). To know for sure, and so that I can apply fixes needed elsewhere in the build system I need to build a .deb for libfaketime with my patches and roll that into the gitian descriptors. My .deb building skills are non-existent so I'd appreciate help with that.
.. I will try to make the changes needed to make use of it. I am a bit confused about that though, I could make fetch-inputs.sh download it from my dropbox and then install it, but I wondered if there was a way to place it in the base VM images? Also, is Vagrant used?
If you want to try my faketime then apply 0001-Use-my-faketime_0.9.6-test-package.patch
It should be applied after your patch, but there's some intermediate patches I've not included (just debugging info) so apply this with "patch -p1 < 0001-Use-my-faketime_0.9.6-test-package.patch".
.. I will try to make the changes needed to make use of it. I am a bit confused about that though, I could make fetch-inputs.sh download it from my dropbox and then install it, but I wondered if there was a way to place it in the base VM images?
Your .deb is introducing the GMP failures for me again (see comment 6 case 1)) while this is not the case with the unpatched libfaketime anymore. I don't know why this is happening yet. What I did was downloading, verifying and copying your version to gitian-builder/inputs. Then I changed gitian-firefox.yml: 1) I removed faketime in the packages section and added your .deb to the files section 2) I put
sudo dpkg -i faketime_0.9.6_i386.deb
at the first place in the script section. Then I restarted the build. Not sure why you want to have your patched faketime package in the base VM. That shouldn't buy you anything.
Also, is Vagrant used?
Building with it should be possible again with the latest trunk if that is what you mean.
I was more wondering out loud what Vagrant was about, I guess it's a VirtualBox based replacement for the KVM based system? Is that roughly correct?
Your .deb is introducing the GMP failures for me again (see comment 6 case 1)) while this is not the case with the unpatched libfaketime anymore. I don't know why this is happening yet.
I had this happen too. So this comes down to the fact that configure checks to see if "the build system is sane" where by sane it means that a file it just created is newer than a file from the $srcdir (it does this by using ls -t and checking that the first line is the new file). When using a libfaketime that correctly freezes time (including nanoseconds), this is not the case. This is what I meant when I said:
even though I think these are 'good' fixes within the remit of libfaketime, I suspect that they may turn occasional failure in gitian into deterministic failure!
.. I need to figure out a good way to fix this; replacing all instances of "ls -t" in configure* before performing the crosstool-ng build is an option here. I'd like to get Mike's take on this. FWIW, I had comments back from the maintainer of libfaketime and he's going to implement nanosecond freezing (as opposed to my nanosecond zero'ing).
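The sanity check described above can be sketched roughly as follows. This is an illustration only, not the actual autoconf code: newest_of is a hypothetical helper, the two file names merely mimic what configure compares, and touch -t is used to stand in for a clock that has been frozen to the whole second. With GNU ls, files carrying identical timestamps fall back to name order, so the freshly created file no longer comes out first.

```shell
#!/bin/sh
# newest_of is a hypothetical helper, not autoconf's real check.
# It prints whichever of the two files `ls -t` lists first (the "newest").
newest_of() {
    LC_ALL=C ls -t "$1" "$2" | head -n 1
}

workdir=$(mktemp -d)
old="$workdir/configure"      # stands in for a file from $srcdir
new="$workdir/conftest.file"  # stands in for the file configure just created
: > "$old"
: > "$new"

# Normal clock: the new file is genuinely five seconds newer.
touch -t 200001010000.00 "$old"
touch -t 200001010000.05 "$new"
normal=$(basename "$(newest_of "$old" "$new")")

# Frozen clock: both files end up with exactly the same timestamp.
touch -t 200001010000.00 "$old" "$new"
frozen=$(basename "$(newest_of "$old" "$new")")

echo "normal clock: ls -t lists $normal first"
echo "frozen clock: ls -t lists $frozen first"

rm -rf "$workdir"
```

In the frozen case the probe can no longer tell that the new file is newer, which is the situation a fully frozen faketime creates for every configure run.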
Having said all this, we need to get ESR 24 to build so I'm proposing to move onto concentrating on that instead.
I can't build the toolchain anymore it seems. At least not with gitian. See the attached log (gcc_build_error.log). Can you verify that? I am basically using crosstool.config_real3 (found in #9829 (closed)) and rev 9a711e316a9c374f815c3d018dd2614fea2382d5.
I suspect that the problem is to do with libfaketime and the random nano-second values, meaning that the determinism we're looking for is not there yet.
This is the build procedure I had success with just now:
untar your Mac OSX 10.6 SDK to $HOME (you should have $HOME/MacOSX10.6.sdk - if it's one with broken absolute links like some are you'll need to fix those links, but the one you've been using in the past should be fine).
Default prefix is /usr/local, pass --prefix= if you want to install it elsewhere.
./bootstrap && ./configure && make
sudo make install
mkdir build
cd build
/usr/local/bin/ct-ng x86_64-apple-darwin10
/usr/local/bin/ct-ng build
Wait an hour or two. Libfaketime can make this several hours!
The cross compilers should be installed to $HOME/x-tools/x86_64-apple-darwin10/
Regarding libfaketime, I ran into an infinite loop bug after I corrected the bug where it didn't change the nanosecond field of the time structure (which I'm guessing upstream have since fixed too), and also an autotools issue where it reports "checking whether build environment is sane .. no". This mail from Eric Blake to the GNU make bugs mailing list touches on both these subjects from the Austin Group's perspective (though not directly) http://lists.gnu.org/archive/html/bug-make/2014-08/msg00044.html. I don't think fixing these issues would be too much effort, but unless they are fixed I'd recommend a different approach to deterministic builds involving post processing object files instead.
mingwandroid: Is there a way to avoid downloading all the dependencies every time we build the cross-compiler? Can I somehow point crosstool-ng to already downloaded sources and convince it to use those instead? Ideally, there should be no need for a network connection at all in the build phase (after fetching all the sources + installing the necessary packages).