Running nspire_emu on OS X without a VM

One of my most heavily used applications is nspire_emu. I use it for testing and debugging my programs. Up until today, I ran it inside a VM.

I decided that ~1GB of RAM to run a small emulator was a little too much – not to mention other overheads. I decided to install Wine and get the emulator to run on OS X without a VM.

Unfortunately, there were a few minor issues in getting it to run.

  • Wine buffers stdout. This means messages from the serial port weren’t being shown on the screen until the buffer was full.
  • The easiest way to fix it was to modify the sources to output on stderr instead.
  • The binary the mingw cross compiler produced on my Mac kept crashing. Yet the original binary from the same source worked fine.
  • The binary also had garbage titles and menu options.

It turns out the crashing was caused by a dodgy hack for quickly dumping an array. The array was cast to a va_list, which worked with the original compiler but not with my mingw compiler. This was easily fixed.

The garbage in the titles and menu options was caused by my cross compiler compiling the resource file incorrectly. I fired up my VM, installed a native mingw system, compiled the resource file there, brought the (now correct) object file across and plugged it into the Makefile.

And of course, I changed every printf and putc call to write to stderr instead (fprintf(stderr, ...) and so on) so the output isn’t held back in a buffer under Wine.
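
As a rough sketch of that last change, the two wrappers here write to stderr and could stand in for printf and putchar. The names emu_printf and emu_putchar are made up for illustration; the actual patch edited the calls directly.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for printf(): same arguments, but the output goes
 * to stderr, which the C library never fully buffers by default. */
int emu_printf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int written = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return written;
}

/* Hypothetical stand-in for putchar(): one character, straight to stderr. */
int emu_putchar(int c)
{
    return fputc(c, stderr);
}
```

Wrappers like these keep the change in one place, at the cost of renaming every call site once.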

I put everything together nicely into an AppleScript bundle, which launches Wine, which in turn launches the patched emulator. I’ve made this available for others who want to use it and included a README too.

nspire_emu_mac

Posted in blog | 1 Comment

bFLT format – implementation notes

One of the biggest show-stoppers in writing a bFLT file loader for Ndless was the lack of documentation for the bFLT file format. Information was scattered everywhere and it was hard to find out how to implement a specific feature. I decided to write a simple description of all the features in bFLT and how to implement them.

The first thing I should mention is that there are two versions of bFLT out there. I will be describing the simpler, version 4 format.

First, we have the file header structure:

#define	FLAT_VERSION			0x00000004L

struct flat_hdr {
	char magic[4];
	unsigned long rev;          /* version (as above) */
	unsigned long entry;        /* Offset of first executable instruction
	                               with text segment from beginning of file */
	unsigned long data_start;   /* Offset of data segment from beginning of
	                               file */
	unsigned long data_end;     /* Offset of end of data segment
	                               from beginning of file */
	unsigned long bss_end;      /* Offset of end of bss segment from beginning
	                               of file */

	/* (It is assumed that data_end through bss_end forms the bss segment.) */

	unsigned long stack_size;   /* Size of stack, in bytes */
	unsigned long reloc_start;  /* Offset of relocation records from
	                               beginning of file */
	unsigned long reloc_count;  /* Number of relocation records */
	unsigned long flags;
	unsigned long build_date;   /* When the program/library was built */
	unsigned long filler[5];    /* Reserved, set to zero */
};

#define FLAT_FLAG_RAM    0x0001 /* load program entirely into RAM */
#define FLAT_FLAG_GOTPIC 0x0002 /* program is PIC with GOT */
#define FLAT_FLAG_GZIP   0x0004 /* all but the header is compressed */
#define FLAT_FLAG_GZDATA 0x0008 /* only data/relocs are compressed (for XIP) */
#define FLAT_FLAG_KTRACE 0x0010 /* output useful kernel trace for debugging */

For simplicity, all fields are in network byte order (i.e. big endian) and are 32 bits wide. When you write your loader, make sure you convert each field to your machine’s native byte order.
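
A minimal sketch of that conversion, assuming the header has been read into a byte buffer. The helper name read_be32 is mine, not part of any bFLT header file. Assembling the value byte by byte works on both little and big endian hosts.

```c
#include <stdint.h>

/* Read one 32-bit big-endian field from a byte buffer.
 * Shifting the bytes into place gives the same result on any host. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

With the header in a buffer called hdr, the rev field would be read_be32(hdr + 4), entry would be read_be32(hdr + 8), and so on in steps of four bytes.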

Loading Executables

The magic field should be the bytes “bFLT”. The rev field should be set to FLAT_VERSION. It’s advisable to check these fields before loading the binary.
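
That check could look something like this. It assumes the first eight bytes of the file are already in memory; flat_header_ok and read_be32 are illustrative names.

```c
#include <stdint.h>
#include <string.h>

#define FLAT_VERSION 0x00000004UL

/* Helper (hypothetical name): read a 32-bit big-endian value. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 if hdr starts with the magic "bFLT" followed by a rev field
 * equal to FLAT_VERSION, and 0 otherwise. hdr must hold at least 8 bytes. */
int flat_header_ok(const unsigned char *hdr)
{
    if (memcmp(hdr, "bFLT", 4) != 0)
        return 0;
    return read_be32(hdr + 4) == FLAT_VERSION;
}
```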

The next 3 fields are all file offsets. If your system doesn’t differentiate between code and data, you can load the text and data sections in a single read since the sections are one after the other in the file. It is assumed that text_start = entry, text_end = data_start, data_end = bss_start.

Although the bss_end field implies that a bss section exists in the file, it actually doesn’t. You need to make sure that (bss_end – data_end) bytes are available and zeroed at the end of the executable image.

The total executable image size can be calculated by (bss_end – entry).
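
Putting the last few paragraphs together, a loader that keeps code and data in one block might build the image roughly like this. It is only a sketch: flat_build_image is a made-up name, the whole file is assumed to be in memory already, and the header fields are assumed to be converted to native byte order.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build the executable image from a bFLT file held in memory.
 * Copies the text and data sections (file offsets entry..data_end) and
 * leaves (bss_end - data_end) zeroed bytes after them for the bss.
 * Returns a malloc'd image of (bss_end - entry) bytes, or NULL on error. */
unsigned char *flat_build_image(const unsigned char *file, size_t file_len,
                                uint32_t entry, uint32_t data_end,
                                uint32_t bss_end, size_t *image_size)
{
    if (entry > data_end || data_end > bss_end ||
        data_end > file_len || bss_end == entry)
        return NULL;

    size_t loaded = data_end - entry;   /* bytes that exist in the file */
    size_t total  = bss_end - entry;    /* text + data + bss */

    unsigned char *image = calloc(1, total);  /* calloc zeroes the bss */
    if (image == NULL)
        return NULL;

    memcpy(image, file + entry, loaded);
    *image_size = total;
    return image;
}
```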

The stack_size field is pretty much self-explanatory.

The flags field is a bitfield of flags. The FLAT_FLAG_* definition comments describe the meaning of the flags.

Relocations

There are two kinds of relocation you may need to do. The first is relocating global and static pointers; the second is fixing up the GOT.

Relocating global and static pointers

If reloc_count is more than 0, it means that there are pointers that need to be fixed up.

The reloc_start field is a file offset to a list of reloc_count entries (note, they are still big endian!). Each entry is a zero-based offset into the executable image and gives the location of a pointer that needs fixing up (the exception being shared library references, explained later).

If you’re implementing XIP and you’re keeping code and data separate, you can work out which section an offset points into by comparing it to the sizes of each section. For example:

if (offset < (data_start - entry)) {
    //pointing to code section
} else if (offset < (data_end - entry)) {
    //pointing to data section
} else if (offset < (bss_end - entry)) {
    //pointing to bss section
}

In most cases, you can derive the real address of the offsets by adding the executable image base address to the offset (obviously a little more work is required for XIP executables).

Dereference the pointer and you get another offset (this time, it’s in the native byte order of the target). This is the offset you need to fix up. In most cases you can add the executable image base address to it and store it back. With XIP, you will need to work out how to get the absolute address for the offset (not very difficult).

After you’ve gone through all reloc_count offsets, you’re done for this part.
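
For the simple case (no XIP, no shared libraries), the whole pass could be sketched as follows. The function and parameter names are mine. The base is passed in as a 32-bit number so the sketch also runs on a 64-bit development machine.

```c
#include <stdint.h>
#include <string.h>

/* Helper (hypothetical name): read a 32-bit big-endian value. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Apply reloc_count relocation entries to image.
 * relocs points at the big-endian list found at reloc_start.
 * base is the address the image is loaded at on the target.
 * Returns 0 on success, -1 if an entry points outside the image. */
int flat_relocate(unsigned char *image, uint32_t image_size,
                  const unsigned char *relocs, uint32_t reloc_count,
                  uint32_t base)
{
    for (uint32_t i = 0; i < reloc_count; i++) {
        /* Where in the image the pointer to patch lives. */
        uint32_t where = read_be32(relocs + 4u * i);
        if (image_size < 4 || where > image_size - 4)
            return -1;

        /* The stored value is an offset in native byte order. */
        uint32_t value;
        memcpy(&value, image + where, sizeof value);
        value += base;
        memcpy(image + where, &value, sizeof value);
    }
    return 0;
}
```

Using memcpy instead of casting to a uint32_t pointer sidesteps alignment problems if an entry is not 4-byte aligned.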

GOT relocation

If your flags indicate that your program is PIC with GOT, you need to do some additional fixups.

At the start of your data section is a list of pointers terminated by a pointer value of -1. They need to be fixed up in the same way you fixed up the relocations above.
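
A sketch of that walk, again for the simple non-XIP case and with made-up names. It follows the description above literally: every entry before the -1 terminator gets the base address added.

```c
#include <stdint.h>
#include <string.h>

/* Fix up the GOT at the start of the data section.
 * data points at the start of the data section inside the image,
 * data_size is how many bytes may be scanned, base is the load address.
 * Returns the number of entries patched, or -1 if no terminator is found. */
int flat_fix_got(unsigned char *data, uint32_t data_size, uint32_t base)
{
    int patched = 0;

    for (uint32_t off = 0; data_size - off >= 4; off += 4) {
        uint32_t entry;
        memcpy(&entry, data + off, sizeof entry);

        if (entry == 0xFFFFFFFFu)       /* -1 marks the end of the GOT */
            return patched;

        entry += base;
        memcpy(data + off, &entry, sizeof entry);
        patched++;
    }
    return -1;
}
```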

Entry point

After all the relocation has taken place, you can simply jump to the base of the code section to run the executable.

Shared libraries

Shared libraries are equivalent to executables in bFLT. The executable linkage to the shared library is determined at compile-time rather than runtime.

Shared libraries are identified by numbers between 0 and 255; 0 and 255 are reserved.

During relocation, the high byte of the offset determines the library needed to complete the linkage.

For example, if the offset needing fixing during relocation was 0x030003a0, the fix up for that offset would be a pointer to the offset 0x03a0 into the library identified by the number 0x03.

You would find the library associated with the number in the high byte, load it into memory, find the absolute address of the offset into the library and replace the fixup with that.

Using the same example, if I had an executable loaded into memory at 0x1000 and a library with an ID of 3 loaded into memory at 0x2000, the fix up for 0x030003a0 would be 0x23a0.
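
The arithmetic in that example can be written out as a small sketch. The lib_base table (load address of each library, indexed by ID) is a hypothetical structure a loader might keep; nothing in the format dictates it.

```c
#include <stdint.h>

/* The high byte of a relocation value selects the library. */
uint32_t flat_reloc_lib_id(uint32_t reloc)
{
    return reloc >> 24;
}

/* The low 24 bits are the offset into that library. */
uint32_t flat_reloc_lib_offset(uint32_t reloc)
{
    return reloc & 0x00FFFFFFu;
}

/* Resolve a relocation value to an absolute address, given a table of
 * load addresses indexed by library ID (hypothetical loader state). */
uint32_t flat_resolve_lib_reloc(uint32_t reloc, const uint32_t lib_base[256])
{
    return lib_base[flat_reloc_lib_id(reloc)] + flat_reloc_lib_offset(reloc);
}
```

With lib_base[3] set to 0x2000, resolving 0x030003a0 gives 0x23a0, matching the example.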

When a shared library is loaded, I believe you also need to enter its entry point to initialize the library before the main binary runs.

Shared libraries are, otherwise, loaded and treated exactly the same as a normal executable.

A movie player for NSpire. Finally.

I’ve been struggling for quite a while now to find the best way to implement a movie player for the NSpire calculator. nPlayer was born around the same time and I realised others were trying to do the same thing. I had to gain an edge over them. I immediately identified a few things I could improve on in nPlayer: first, it was closed source, and second, it didn’t have decent video compression. From there, I set out on my journey to create a good movie player for the NSpire.

The first version I made was a failure. It worked but took way too much storage. I recall it was in the tens of megabytes for just over a minute.

Since then, I have been looking for ways to reduce the amount of storage needed for a video file. I looked at run length encoding, but I realised that unless the video consisted of long runs of identical data (which it basically never does), run length encoding would actually take more storage than no compression at all.
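
To see why, here is a sketch that computes the output size of a very simple run length scheme, where each run is stored as a (count, value) byte pair. That scheme is an assumption for illustration; it isn’t necessarily the one I tried back then.

```c
#include <stddef.h>

/* Size in bytes of data[0..n) after a simple run length encoding where
 * every run becomes two bytes: a count (1 to 255) and the repeated value. */
size_t rle_encoded_size(const unsigned char *data, size_t n)
{
    size_t out = 0;
    size_t i = 0;

    while (i < n) {
        size_t run = 1;
        while (i + run < n && run < 255 && data[i + run] == data[i])
            run++;
        out += 2;   /* one count byte plus one value byte */
        i += run;
    }
    return out;
}
```

Four identical bytes shrink to 2, but four different bytes grow to 8, so data with few repeats ends up twice its original size.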

At one point, I looked at porting libmpeg to the NSpire. After several failed attempts, I finally had something that ‘probably’ worked. But I realised the conversion process was too complicated: the library needs a pure mpeg2 stream and that’s not so easy to produce. Eventually I gave up on that.

Next, I took a look at using zlib to compress the frames. This, however, only reduced sizes by around 10 percent at most, and that still wasn’t good enough.

After a long break, I realised that video needs lossy compression or else the size won’t come down by much. And so, I decided to look at motion jpeg. However, I didn’t manage to get it working on the NSpire.

Finally, I compromised. I decided to give up on porting a real video codec while still using lossy compression. I remembered that jpeg files compress extremely well and I decided I could concatenate a whole series of jpeg files together into a single file and have them uncompress on the spot as it was being played.

All that was left was to find a library that would decode jpeg files. I found a one-shot C library that did exactly that: I feed it jpeg data and it spits out a decoded frame.

After that, all that remained was to process the buffer, convert the colors and draw them to the screen.

And voila, that is how Nspire Movie Player was born!

My house goes IPv6

Many of you will know that the last blocks of IPv4 addresses were allocated last year. Slowly, companies are migrating their servers and computers to IPv6.

I felt that it was about time to jump on the bandwagon. Migrating to IPv6 was actually a lot easier than I had imagined it to be.

First up, I needed an IPv6 capable router, since my 5 year old wireless router didn’t support it. I chose the Netgear WNDR3700v2 since it had IPv6 support and support for 3rd party firmware.

Since my ISP didn’t offer native IPv6, I chose to get IPv6 through a 6to4 gateway. The gateway address is an anycast address which automatically routes to the closest 6to4 gateway.

6to4 gives you a massive /48 IPv6 subnet just for your home! The first 16 bits are the prefix 2002, which tells IPv6 routers that it’s a 6to4 address. The next 32 bits are your IPv4 address. The remainder is up to you to allocate.
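
As a sketch of how the prefix is put together, the code here builds the three 16-bit groups from a public IPv4 address. The function names are made up, and 192.0.2.1 is a documentation address used purely as an example.

```c
#include <stdint.h>
#include <stdio.h>

/* Fill groups[0..2] with the first three 16-bit groups of the 6to4 /48
 * prefix: 2002, then the IPv4 address split into two halves. */
void sixtofour_prefix(const uint8_t ipv4[4], uint16_t groups[3])
{
    groups[0] = 0x2002;
    groups[1] = (uint16_t)((ipv4[0] << 8) | ipv4[1]);
    groups[2] = (uint16_t)((ipv4[2] << 8) | ipv4[3]);
}

/* Write the prefix in text form, e.g. "2002:c000:201::/48" for 192.0.2.1.
 * Returns what snprintf returns (the length of the full text). */
int sixtofour_format(const uint8_t ipv4[4], char *buf, size_t size)
{
    uint16_t g[3];
    sixtofour_prefix(ipv4, g);
    return snprintf(buf, size, "%x:%x:%x::/48",
                    (unsigned)g[0], (unsigned)g[1], (unsigned)g[2]);
}
```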

The Netgear has built-in support for 6to4 gateways so I didn’t have to use 3rd party router firmware. I simply enabled it along with the IPv6 firewall and off I went. All my IPv6 compatible Macs automatically configured themselves with an IPv6 address and I was able to connect to IPv6 hosts.

Keep in mind that IPv6 addresses are directly routed and you don’t sit behind a NAT any more. All your IPv6 computers will be directly exposed to the internet. Make sure you have some appropriate IPv6 firewall rules in place.

Thanks to 6to4 gateways, I was easily able to get online with an IPv6 address and my house is officially IPv6 enabled!

…And “rb” comes back and bites me on the ass again…

I was working on an ELF loader for Ndless and I kept wondering why my test suites kept crashing while the main program loader worked – even though they were loading the exact same files.

Yup, after an hour of almost going insane, I found the problem. I was using fp = fopen("file", "r"); instead of fp = fopen("file", "rb");

Ouch.
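
For the record, a sketch of reading a binary file the safe way. On C libraries that distinguish text from binary streams, "r" is allowed to translate line endings or treat certain bytes specially, which corrupts binary data like ELF files. The helper name read_file_bytes is made up.

```c
#include <stdio.h>

/* Read up to cap bytes from path in binary mode.
 * Returns the number of bytes read, or 0 if the file can't be opened. */
size_t read_file_bytes(const char *path, unsigned char *buf, size_t cap)
{
    FILE *fp = fopen(path, "rb");   /* "rb", not "r": no text translation */
    if (fp == NULL)
        return 0;

    size_t got = fread(buf, 1, cap, fp);
    fclose(fp);
    return got;
}
```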

Git is… pretty good

Recently, I’ve realized the world has gone Git. Most new projects I’ve seen are hosted on GitHub. But, having previously been a Subversion user, I really didn’t get what the hype about Git was.

So, one day, I decided to try it out. I dug around some tutorials hoping to find out how Git worked and to find the equivalents of the Subversion commands I knew.

It was only after I grasped the concept of distributed source control that I really understood Git. It’s absolutely essential that you forget everything you know about Subversion. I’ll try and explain it in ‘Subversion-y’ terms.

Brief Explanation

Distributed source control is basically source control where everyone is the ‘repo’. There is no central server. This is a very important concept. Unlike Subversion, there is no central location to store your files. You are responsible for your own repo and files. Everyone has a complete history and can do whatever the hell they want to the files in their repo without affecting anyone else. There is no concept of synchronization. Everyone’s repo will be different (well, most of the time, they are).

So, how do we share changes? How do we contribute changes back to the project? If there’s no central server, how does it work?

In a Subversion repo, if you wanted to share changes, you had to get a login from whoever manages the repo and then submit your changes by committing to the repo.

In Git, it’s a little different. There’s no concept of having to “log in” to a repo. You have your own repo and it’s yours completely. If you wanted to share changes, you’d put your repo up in public. If people like your contributions, they’ll ‘pull’ your changes into their own personal repo.

Likewise, if you see an interesting change in someone else’s repo, you have the option of ‘pulling’ from them.

The whole distributed system works by you pulling changes from people you trust. There isn’t a central place to get things from. Once you understand this, Git is pretty easy to get.

The other thing that confused me with Git was the staging area. Basically, the staging area is just a list of files that you’ll be checking in. To commit your changes, you’d stage the files you want to commit and then you’d commit them.

The last major thing that confused me about Git is versioning. Git doesn’t store revision numbers. It stores snapshots. Each commit works much as if you had just duplicated your working directory. It’s important to realize that sequential revision numbering would simply not be possible in a distributed system because the state of everyone’s repo is different.

So, now you’re probably wondering why I love Git so much.

1. Speed.

Git clones a repo MUCH faster than checking out in Subversion. Subversion downloads one file at a time while Git compresses the whole lot and downloads it all at once.

Not only is it doing it faster, but Git is also downloading the whole repo while Subversion is only downloading one revision.

Creating a repo under Git is also so much easier. With Subversion, you had to go and set up a server somewhere and configure users etc…

With Git, I literally run 3 commands and I have a repo ready 10 seconds later.

2. Branching

Git has amazingly well designed branching that doesn’t scare people away. During my time with Subversion, I was actually scared of making branches because they seemed complicated and horrible to maintain and switch between.

Git makes it so easy. I can make and swap between branches as often as I want and it takes less than a split second.

3. It’s distributed

Because Git has no central server, I can commit changes from wherever. I don’t need internet access to commit changes.

The idea of not having to worry about backups is also a huge plus. Since everyone has a complete history of changes, it doesn’t matter if the server we use to host Git code on dies. Everyone can simply clone someone else’s repo and get started right away.

Of course, no central server equals not having to worry about usernames and passwords. It’s simply a matter of pulling from people you trust.

Conclusion

Although the concept of Git and distributed source control is a lot more complicated to grasp than Subversion, given its huge advantages, I think it’s definitely worth going Git.

I’ve been using it for all my projects ever since.

There’s no “better” in technology

No seriously, there isn’t. I’m sick of people asking “which is better? X or Y?”.

With technology, there isn’t a “better” or “worse”. There are only “pros” and “cons” and personal preference. That’s what everything really boils down to anyway.

Think about it. If there really was such a thing as “better”, why do competing products exist? If “Windows is better than Mac”, why do Mac users exist? If “iPhones are better than Androids”, why do Android users exist?

When people come and ask me “which is better? X or Y?”, it’s a good indicator that they’re buying whatever they’re asking about just to fit in or simply because they can. They don’t really need it. Some of the time, these people are closed-minded idiots who see everything as black and white, so your advice is probably not going to be appreciated anyway.

The people who know exactly what they want from their device are generally more open minded. They are the people I tend to give more comprehensive advice to. You can easily tell too. If they ask for pros and cons with two different platforms, you know they see things as more than just black and white and are people you can talk intelligently to.

And of course, you have the people who don’t care about what’s better, and just choose out of personal preference. And that’s totally fine. Because it seriously doesn’t matter what phone/gaming console/etc.. they use. It’s why alternatives exist – simply personal preference.

Likewise, if you ask someone, “Which is better? X or Y?” and they give an answer without telling you pros and cons or anything, you know that they are either a fanboi or a closed-minded idiot. You probably shouldn’t listen to their advice.

So, the point of this rant is: There is no “better” in technology. Stop asking me what’s better. If you must ask me, ask intelligent questions like “What are the pros and cons of these two platforms?”

Posted in blog | 5 Comments

ELF loader for Ndless

So, a few weeks ago, I posted something about using C++ with the Ndless framework. Unfortunately, while the compiler worked, the end binary kept crashing. I figured that it was because of the C++ vtables having invalid function pointers. The linker wasn’t updating them correctly.

The same problem shows up with statically initialized pointers. The following code crashes on Ndless.

void foo() {}
int main() {
    static void (*var)() = foo;
    var();
    return 0;
}

The reason is the way Ndless links and loads your programs. When you link on your development machine, the linker assumes a load address and calculates everything from that. But on Ndless, you can’t load a program at a fixed address because you don’t know what address you’re going to get from malloc. So, we solve this by relocating.

The startup file that is compiled into Ndless binaries looks up the Global Offset Table and fills in the correct data at run time. The code that runs first is position independent: it works out its current location and the location of the GOT, then patches the GOT so our executable can run correctly.

Unfortunately, statically allocated pointers are stored in the .data section of the executable. When the executable is turned into a memory image for running on Ndless, all the symbol information telling the linker where each section starts and stops and where all the variables are is gone. So, while the GOT gets updated with correct pointers, our statically allocated pointers don’t.

This is a fundamental problem with the way Ndless loads things. Because of this, I started working on an ELF loader a few days ago. An ELF file still contains all the required symbol definitions and the whole lot. That means our ELF loader can patch everything our program needs to run.

The code is still in its disgusting stage and is in need of a lot of polishing. If you would like to take a look, the code is hosted on GitHub: https://github.com/tangrs/ndless-elfloader

Getting a C++ compiler for Ndless

The more I work on nspire-gamekit, the more I think an object-oriented programming language would be more suitable for the task. I’m currently playing around with C++ and getting it to work with the ndless SDK (which doesn’t natively support C++).

It is very likely I will make the whole project object-oriented since that models the problem of game development much better. Overall it can produce cleaner looking code (the current example game has so many static variables it’s not even funny).

So, if there are people who want to use C++ on ndless code, here is how you can get g++ working alongside the ndless SDK.

First, (if you haven’t already), install newlib for the arm architecture. Download it or check it out from CVS, cd into the directory and configure it with ./configure --target=arm-none-eabi then the usual make and make install.

Next, we’ll need to compile g++ and have it work with ndless.

I configured my GCC installation with ../gcc-4.6.1/configure --target=arm-none-eabi --disable-libssp --disable-multilib --disable-nls --disable-threads --disable-shared --enable-languages=c,c++ --with-newlib --disable-newlib-supplied-syscalls. Again, make and make install.

If you have trouble with redefinitions of “signal.h” like I did, simply delete the signal.h file in /usr/local/arm-none-eabi/include and replace it with an empty file.

Now, you need to go into your ndless/bin folder and copy nspire-gcc to nspire-g++. In the new file, change the last few lines so they look like this:


GCC=`(which arm-elf-g++ arm-none-eabi-g++ arm-linux-gnueabi-g++ | head -1) 2>/dev/null`
# -fno-builtin: We prefer to use syscalls. And GCC's builtins expansion (http://www.ciselant.de/projects/gcc_printf/gcc_printf.html)
# is incompatible with the inline definition of most syscalls.
"$GCC" -mcpu=arm926ej-s -I "$DIRNAME/../include" -fno-exceptions -nostdlib -fpic -fno-builtin "$@"

Pay special attention to adding -fno-exceptions -nostdlib to the list of flags. We will be disabling built in functions and sacrificing being able to use exceptions since the ndless SDK doesn’t implement the required functions.

We also need to change the linker from GCC to G++ so the linker is capable of linking both C++ and C files. It also needs to insert some initialization logic for C++ into the final binary before main() is called.

Change your nspire-ld file so the last few lines look like this:


# some newlib symbols are not found if ld is used...
GCC=`(which arm-elf-g++ arm-none-eabi-g++ arm-linux-gnueabi-g++ | head -1) 2>/dev/null`
# lazy system build: must be built with the same toolchain
(cd "$DIRNAME/../system" && make -s all)
ret=$?
if [ $ret -ne 0 ]; then
exit $ret
fi
if [ $nostartup = false ]; then
[ $lightstartup = true ] && startupobj=crt0light.o || startupobj=crt0.o
args=("$DIRNAME/../system/$startupobj" "${args[@]}")
fi
# -nostartfiles: avoids newlib startup which would be added before ours
"$GCC" -nostartfiles -T "$DIRNAME/../system/ldscript" -e _nspire_start -L "$DIRNAME/../lib" -static "$DIRNAME/../system/crt0sym.o" "${args[@]}" -lndls -mcpu=arm9

Lastly, in your code, you have to implement the new and delete operators using the ndless malloc() and free(). The built-in versions use the stubs in newlib, I think, so we can’t use those.

Somewhere in your code, implement this code:

void *operator new(size_t size)
{
    return malloc(size);
}

void *operator new[](size_t size)
{
    return malloc(size);
}

void operator delete(void *p)
{
    free(p);
}

void operator delete[](void *p)
{
    free(p);
}

After all of the above modifications, your SDK should still work perfectly as before. You also now have a working C++ compiler!

Remember, you cannot use exceptions with this SDK since they require the use of unimplemented functions.

John Smith recently read these articles

An interesting trend I’ve noticed on Facebook recently is the “X recently read articles” box. It shows up on the Newsfeed along with links to the articles friends have read.

I think it’s a very clever way to make money. Almost every time, the articles have a title that appeals directly to teenagers (who are the major users of social networking). The articles are always either controversial, about sex or about issues involving teenagers.

Of course, when they click it, they don’t even have to like or share the link to the article. As soon as they read it, it pops up on their friends’ Newsfeeds where the cycle begins again. In the end, the article links spread like a virus.

So what does it mean? It means the companies behind it all make a lot of money from advertising. They can drag in viewers and have the viewers share the link without the user having to do anything at all. On top of all that, the user consents to it (you consented automatically when you clicked “Add to Facebook” on the permissions page). It’s too convenient for the companies.

It also means our news articles that are published will be increasingly “dumbed-down” for teenagers. Instead of proper, intelligent articles, we’ll be seeing more “dumbed-down” and deliberately controversial articles targeted at teenagers with the intent of pulling more views.

I think, overall, it’s a pretty damn clever way to make some quick cash from ignorant teenagers. Evidence from my own Newsfeed shows that it really works. Even I have been tempted to click and read some of the articles. Well played.
