Category Archives: neural networks

rewriting the trading algorithm

I have the script now running at a realistic 1.14% per day return, taking into account the average number of times that I’ve had to pay a trading fee, the average offset of what price a trade happens vs what price the algorithm wanted it to happen, and other unpredictable things.

The script as it is, is very hard to train in any way other than almost a brute force fashion. Because most of the variables (length of EMA long tail, number of bars to measure for ATR, etc) are integers, I can’t hone into the right values using a momentum-based trainer, so I need to do it using long, laborious loops.

There is one improvement that I could do, which would probably double or more the return, without drastically changing the script. At the moment, there is a lot of “down time”, where the script has some cash sitting in the wallet, and is waiting for a good opportunity to jump on a deal. If the script were to consider other currencies at the same time, then it would be able to use that down time to buy in those currencies.

On second thought, I think the returns would be much higher, because when the script currently makes a BUY trade, its usually sold again within about 30 minutes. That means that it could potentially make 48 BUY trades in a day, each with an average of maybe a .5% return. That’s about a 27% return ((1+(.5/100))48 = 1.27).

That would be nice, but it’s impossible with the GDAX market, because the GDAX market only caters to four cryptocurrencies at the moment. Also, while the money aspect is nice, I’m actually doing this more for the puzzle-solving aspect. I get a real thrill out of adding in the real-life toils and troubles (fees, unpredictable trades, etc) and coming up with solutions that return interest despite those. So, I’m not going to refine the hell out of the script. Instead, I’m going to start working on another version.

Before I started on the current version, I had made a naive attempt at market prediction using neural networks. While some of the results looked very realistic, none of them stood up to scrutiny. I failed. But, I also learned a lot.

I’m going to make another neural-net-based attempt, using a different approach.

The idea I have is that instead of trying to predict the upcoming market values, I’ll simply try to predict whether I should buy or sell. This reduces the complexity a lot, because instead of trying to predict an arbitrary number, I’m only outputting a 1 (buy), 0 (nothing), or -1 (sell). This can then be checked by running the market from day 1 of the test data using each iteration of the network, and then adjusting the weights of the network based on whether the end result was higher or lower than the last time it was run.

I noticed that the best tunings from my current script only made a very few trades, on the very few days where the market basically dropped like a stone and then rebounded. But I can’t assume those will happen very often, so I like to see a few trades happen every day. With the neural network, I can increase the number of trades by simply adjusting the output function that decides whether a result is -1, 0, or 1 (this is usually done by converting the inputs into a value between -1 and 1, then rounding to an integer based on whether the value is above/below 0.7/-0.7. That value can be adjusted easily to provide more 1/-1 hits).

The current approach also involves a lot of math to measure ATR (average true range), running EMA (exponential moving average), etc. With the neural network approach, I will just need to squish the numbers down to form a pattern between -1 and 1 (where 0 is the current market price) and run directly against those numbers.

Because neural networks are a lot more “touchy-feely” than the current EMA/stop-gain/stop-loss approach I’m using, I will be able to “hone in” on good values – something I cannot do at the moment because of the step-like nature of integers.

I won’t be using any off-the-shell networks, as I can’t imagine how to write a good trainer for them. Instead, I’ll write my own, using a cascade-correlation approach.

Cascade-correlation is an approach to neural networks which allows a network to “grow” and gradually learn features of the input data. You train a network layer against every single input, until the output stops improving. You then “freeze” that network layer so is not adjusted anymore. You then create a second layer which trains against every single input and the previously trained network. You continue adding layers until there is no more noticeable improvement.

The point of this is that the first training layer will notice a feature in the data that produces a good result, and will train itself to recognise that feature very well. The second layer will then be able to ignore that feature (because it’s already being checked for), and find another one that improves the results. It’s like how you decide what animal is in a picture – does it have 6 legs (level one), is it red (level two), does it have a stinger (level three) – it’s a scorpion! Instead of trying to learn all the features of a successful sale at once, the algorithm picks up a new one at each level.

Around Christmas, I was playing with the FANN version of cascade correlation, and I think it’s very limited. It appears to create its new levels based on all inputs, but only the last feature-detection layer. Using the above example, this would make it difficult to recognise a black scorpion, as it would not be red. I believe that ideally, each feature layer should be treated as a separate new input, letting the end output make decisions based on multiple parallel features, not a single linear feature decision.

idea for music recognition, conversion and composition using artificial neural networks

I had this idea while walking the kids to school. Starting from a simple network that can classify music styles as rock/metal/classical/folk/etc, I think that it would be possible to adapt the same algorithm to convert a music file from one style to another, and even write music from scratch in whatever style you want. And if I’m right, I think it would be very simple to write.

Recognition

This is the simplest task. To recognise the style of a music file, all you need is a feed-forward network with a few thousand inputs, at least one hidden layer, and one output for each style you want to recognise.

A standard data rate for recorded music is 160kbps. That means that every second, there are 10,240 separate wave heights (160*1024/16) that need to be examined. Of course, you can recognise music using lower bps values, but let’s use the same setting for the whole process (160 will be wanted for later parts).

So, the input layer would need 10,240*n inputs, where n is the number of seconds you want the network to sample in order to determine the style. In some cases (metal/classical), you may get away with sampling just a single second, but for better results, you might want a larger value. I’ll be setting n to 300, so it samples the entire song in most cases. This makes it easier to be accurate about the result, but will also be useful in a later stage.

The output layer needs to have one node per tag you want to measure. For example, you might have an output that measures how “rock” a song is, and another that measures how “baroque” it is. You could use output nodes that return a simple Yes/No result, but there is a good reason to return a more linear certainty instead (which we’ll get to).

The hidden network needs at least one neuron, obviously, but I don’t think there is any way to say exactly how many it needs, so it would be better to use a network model which grows automatically as it learns (I don’t know the technical term – I just build the things!).

After building the network, you need to train it. This is the easiest part – you just need a large database of music, and tags for every one of those tunes.

One handy idea: if you’re training a 5 second network (for example), then a 3 minute song has at least 36 completely separate training sets for you to sample – all you need to do is start linking to the inputs at second 0, 1, 2, .5, etc, and the network will see what it thinks (initially) is a completely different data set.

After training this for a while, you should be able to run a few seconds of a song through the network and have fairly accurate results of how “funk” or “jazz” a song is.

Conversion

After figuring out the above, I started thinking of alternative uses for the idea, and one surprising idea took hold.

Let’s say that you have a “folk” song played on guitar and violin. How would you go about making it “metal”? You could start by fuzzing the violin and distorting the guitar, and maybe adding some drums in.

I think it would be possible to write a program which lets you convert a song from one style to another literally at the click of a button.

Remember I mentioned that the output neurons should say how metal/classical/etc a song is, not just that it is or is not.

If the network is written with enough precision, then adjusting one or more of the input values should give a different value in the outputs.

As an example, let’s say you have a folk tune that you want to convert to neo-punk. Adjusting the inputs such that the sounds are more distorted (clipping high values, for example), or faster (shifting later inputs to the left, maybe) might change the tune’s “neo-punk” output from 0.00024 to 0.00025.

If you repeat this over and over (automatically, obviously), discarding changes that reduce the output and repeating changes that increase the output, until the “neo-punk” output reaches an acceptable threshold such as .9, then you have just created an automatic way to convert a tune from one style to another.

I think this has a lot of applications. For example, let’s say you want to convert a piano tune to guitar? You train your network to recognise what piano and guitar tunes sound like, and then simply convert as above!

Composition

This may be the simplest of the lot.

After creating the above programs, try inputting a sound sample of pure static into the conversion program, and tell it to convert the static to piano. I think it would come up with some interesting tunes. Maybe not completely accurate tunes, but they would be interesting.

I think the network would automatically learn rules about harmony and rhythm, but don’t think it would learn about structure. For example, you could train a network to recognise a 3/4 rhythm, but I don’t know if you could write something that recognises a fugue.

ToDo

List of things off the top of my head that I want to do:

  • write a book. already had a non-fiction book published, but I’d love to have an interesting an compelling original fiction idea to write about. I’m working on a second non-fiction book at the moment.
  • master a martial art. I have a green belt in Bujinkan Taijutsu (ninja stuff, to the layman), but that’s from ten years ago – found a Genbukan teacher only a few days ago so I’ll be starting that up soon (again, ninja stuff).
  • learn maths. A lot of the stuff I do involves guessing numbers or measuring. it’d be nice to be able to come up with formulas to generate optimal solutions.
  • learn electronics. what /is/ electricity? what’s the difference between voltage and amperage? who knows… I’d like to.
  • create a robot gardener. not just a remote-control lawn-mower. one that knows what to cut, what to destroy, that can prune bushes, till the earth, basically everything that a real gardener does.
  • rejuvenate, or download to a computer, whichever is possible first. science fiction, eh? you wait and see…
  • create an instrument. I’m just finishing off a clavichord at the moment. when that’s done, I think I’ll build another one, based on all the things I learned from the first. followed by a spinet, a harpsichord, a dulcimer, and who knows what else.
  • learn to play an instrument. I’m going for grades 2 and 3 in September for piano. I can play guitar pretty well, but would love to find a classical teacher.
  • write a computer game. I have an idea, based on Dungeon Keeper, for a massively multiplayer game. maybe I’ll do it through facebook…
  • write programs to:
    • take a photo of a sudoku puzzle and solve it. already wrote the solver.
    • take a photo of some sheet music and play it.
    • show some sheet music on screen, compare to what you’re playing on a MIDI keyboard, and mark your effort.
    • input all the songs you can play on guitar/keyboard. based on the lists of thousands of people, rate all these songs by difficulty, to let you know what you should be able to learn next.
    • input a job and your location. have other people near you auction themselves to do the job for you. or vice versa: input your location, and find all jobs within walking distance to you where you can do an odd job for some extra cash (nearly there: http://oddjobs4locals.com).
    • takes a photo and recognises objects in it (partly done)
    • based on above, but can also be corrected and will learn from the corrections (also partly done)
  • stop being damned depressed all the time.

There’s probably a load of other stuff, but that’s all I’ve got at the moment!

visual sudoku solver

The plan for this one is that, if you’re doing a sudoku puzzle in the pub or on the train, and you get stuck, you just take a snapshot of the puzzle with your camera-phone, send the photo to a certain mobile number, and a few seconds later the solution is sent back as an SMS message. The solution costs you something small – 50 cents, maybe.

How it works is that the photo gets sent via SMS to a PHP server, which manipulates the photo in a small number of steps:

  1. Find the corners of the puzzle.
  2. Extract the numbers and identify them.
  3. Also identify whether the numbers were printed or hand-written.
  4. Solve the puzzle with the printed numbers.

In the above, the only difficult step is actually the first one. I’m still trying to figure that one out. I think it will involve a bit of maths, which should be interesting.

For identifying the numbers, you just need a small neural net. This is much easier than full-blown OCR, because in OCR, you have the additional problem of trying to identify what is a letter, what’s a word, and what’s empty space. In the Sudoku puzzle, though, there is a well-defined grid, and a certainty that each square contains just one number to be identified.

Identifying whether a number was handwritten or printed should also be easy – the colour of the figure will give it away. The colour of the puzzle frame will match the printed numbers, but will be slightly off from the hand-written figures. Of course, it won’t be as simple as that in practice, but that’s what I’ll be testing first.

Actually solving the puzzle is simple – I’ve already written a javascript sudoku solver, so it’s just a matter of porting it to PHP.

lesson learned – recurrent networks are sequential

Okay – I was watching paint drymy neural network learning today and something surprising happened about an hour into the sequence – suddenly, instead of getting the tests 90%+ correct, it was getting them almost 100% incorrect!

It took me a few moments to realise what had happened – for some reasons, I had decided to try teach the network the difference between two objects for a lot of sessions before adding a third ingredient. I thought that would teach the network to very finely know the difference between them.

It turns out, though, that the network was smarter than me. It learned that the answer to the current test is… the exact opposite of whatever the previous answer was. The damn thing learned to predict my tests!

So – what I’ve done to correct this, is that ta the beginning of any test, I now erase the neural memory in the network. In effect, I’m whacking the network on the side of the head so it can’t learn to predict my sequences and must learn the actual photos instead.

ANN progress

Last week, I posted a list of goals for my neural network for that weekend. I’ve managed most of them except for the storing of the network.

Didn’t do it last weekend because I was just plain exhausted. After writing the post, I basically went to sleep for the rest of the weekend.

demo (Firefox only)

The demo is not straightforward. Usually, I create a demo which is fully automatic. In this case, you need to work at it. In this case, you need to enter an image URL (JPG-only for now), a name (or names) for a neuron which the photo should answer “yes” to, and similar for “no”, then click “add image”.

The network then trains against that image and will alert you when it’s done. The first one should complete pretty much instantly. Subsequent ones are slower to learn (in some cases, /very/ slow) as they need to fit into the network without breaking the existing learned responses.

I’ve put a collection of images here. The way I do it is to insert one image from Dandelion, enter “dandelion” for yes and “grass,foxtail” for no. Then when that’s finished, take an image from Foxtail and adapt the neuron names, then take one from Grass, and similar. Over time, given enough inputs, I think the network would train well enough to learn new images almost as fast as the first one.

Onto the problems… I spent all weekend on this. First problem is with the sheer number of inputs – with an image of 160×120 size, there are 19200 pixels. With RGB channels, that’s 57600!

So the first solution was to read in the RGB channels and change them to greyscale – a simple calculation – that cut the inputs by two thirds.

Then there was the speed – JavaScript is pretty fast these days, but it’s nowhere near as fast as C, Java or even Flash. If a loop runs for too long, an annoying pop-up appears saying “your browser appears to have stopped responding”. So, I needed to cut the training algorithms apart so they worked in a more “threaded” fashion. This involved a few setTimout replacements for some for(...) loops.

Another JS problem had to do with the <canvas>. There is no “getPixel()” yet for Canvas, so I get to rely on getImageData which returns raw data. According to the WhatWG specs, you cannot read the pixels of a Canvas object which has ever held an image from an external server. So, I needed to create a simple Proxy script to handle that. (just thought of a potential DOS for that – will fix in a few minutes)

Another problem had to do with the network topology. Up until now, I was using networks where there were only inputs and outputs – no hidden neurons. The problem is that when you train a network to recognise something in a photo based on the actual image values, you are relying on the exact pixels, and possibly losing out on quicker tricks such as combining inputs from neurons which say “is there a circle” or “are there parallel lines”. Also, networks where all neurons can read from all inputs are slow. A related problem here is that it is fundamentally impossible to know exactly what shortcut neurons would give you the right answer (otherwise there would be no need for a neural network!).

So, I needed to separate the inputs from the outputs with a bunch of hidden neurons. The way I did this was to have a few rules – inputs connect to nothing, hidden neurons connect to everything, outputs connect to everything except inputs.

The hope with this is that the outputs would read from the hidden neurons, which would pick up on some subtle undefinable traits they spot in the inputs. So the outputs would be almost like a very simple network “overlaid” on the more complex hidden neurons.

Problem is – how do you train these hidden units in a meaningful way? I’m still working on this… Right now, it’s based on “expectation” – the output neuron, when it produces a result, is either right or wrong. If the neuron is then to be trained, then it tells the hidden units that it relies on what output it expected from those units in order to come to the correct conclusion. The hidden units wait until there have been a number of these corrections pointed out to them, then it adjusts accordingly. (two thoughts just occurred to me – 1; try adjusting the hidden neurons after every ‘n’ corrections (instead of at arbitrary points), and 2; only correct a hidden unit if the output neuron’s result was incorrect.)

Another thing was that there is no way of knowing how many hidden units are actually required to have the network produce the right results, so the way I get around this is to train the network, and every (hidden units * 10) cycles, if a solution has still not been found, add another hidden network. This means that at the beginning of training, hidden units will be added pretty regularly, but as the network matures, new neurons will be added at rarer periods.

I’ve caught the ANN bug. After spending the weekend on this, I still have more ideas to work on which I’ll probably do during the week’s evenings. Some ideas:

  • Store successful networks after training.
  • Have a number of tests installed from “boot” instead of needing them to be entered manually.
  • Disable the test entry form when in training mode.
  • Work some more on the hidden unit training methodology – it’s still way off.
  • Allow already-inputed training sets to be edited (for example, if you want to add “rose” to the “no” list of “grass”).

I think that a good network will have a solid number of hidden neurons which do not change very often. The outputs will rely on feedback from each other as well as those hidden units. For example, “foxtail” could be gleaned from a situation where the “dandelion” is a definite no, and “grass” is a vague yes, along with some input from the hidden units approximating to “whitish blobs at the top” (over-simplification, yes, but I think that’s approximately it).

update It turns out the network learns very very well given just two neurons to learn – I’ve managed to get the network to learn almost every image in “dandelions” and “grass” with only 4 hidden units and 2 output neurons.

today's ANN goals

I’ve been doing well over the last two weeks – I started with an ANN which can balance a pole, and upped that by then creating a net which could recognise letters.

The plan for today is a bit more ambitious. I’m writing it down here in case it takes longer than a day to write it. In general, I want to write a PHP application which will allow you to upload images, which the ANN will then try to recognise. If it gets it right, all well and good. If it gets it wrong, you can correct it.

Some milestones for the project:

  • readers can upload images and have them tested and/or added to the training sequence
  • extraction of image pixels using <canvas>
  • automatic creation of new neurons as they are required
  • best net is stored on server, so new readers always start with a working net

I think this may grow into a damned cool thing – I’m already thinking of other cool features like distributed nets, or background nets which can be placed in other pages of a site so the thing can continue training even though the user is not actively viewing the thing (that might be a bit cheeky though).

Anyway – now that I’ve written what I intend to do, I suppose I’d better actually do it.

learning how to learn

Yesterday’s attempts at ANN training seemed at first to be successful, but I had overlooked one simple put curious flaw – there was training going on all the time, and each test was run 10 times… This means that each neuron was trying different values all the time until it got the right one, then it would be the next neuron’s turn (a bit of a simplified answer, but I don’t know how to describe what was actually happening). This ended up causing the tests to look a lot more successful than they actually were.

This morning, I did a lot of work figuring out how to get around the problem. It turns out the problem is not with the neural network – that appears to be working perfectly. The problem is with the method of training.

Just like with people, you cannot just throw a net into a series of 26 tests which it has never seen before and expect it to learn it any time soon. For any particular neuron, 25 of the tests will have “No” as the answer, and it is too easy for the neuron to just answer “No” to everything and get a 96+% correct answer.

Instead, you need to start with just one test, keep trying until that’s right, then add another test, keep going until they’re both right, then add another test, etc.

Even that was not enough, though – it turns out that the capital letters “B” and “E” are very similar to each other, as are “H” and “M” (at least, in my sequence, they are).

I managed to improve the learning of the neurons by following rules such as these:

  • If a test’s answer is “No” but the neuron says “Yes” (ie; has a returned value above 0), then adjust the weights of the neuron it (correct/punish it).
  • If a test’s answer is “Yes”, and the neuron is anywhere less than 75% certain of Yes, then adjust/reward the neuron.

In all other cases (neuron is certain of Yes and is right, or neuron is vaguely sure of No and is right) leave the neuron’s weights alone.

This has helped to avoid the problem where neurons get extremely confident of Yes or No and are hard to correct when a similar test to a previously learned one comes along (O and Q for example).

It’s still not perfect, but perfection takes time…

letter recognition network

Last week, I wrote a neural network that could balance a stick. That was a simple problem which really only takes a single neuron to figure out.

This week, I promised to write a net which could learn to recognise letters.

demo

For this, I enhanced the network a bit. I added a more sensible weight-correction algorithm, and separated the code (ANN code).

I was considering whether hidden inputs were required at all for this task. I didn’t think so – in my opinion, recognising the letter ‘I’ for example, should depend on some information such as “does it look like a J, and not like an M?” – in other words, recognising a letter depends on how confident you are about whether other values are right or wrong.

The network I chose to implement is, I think, called a “simple recurrent network” with stochastic elements. This means that every neuron reads from every other neuron and not itself, and corrections are not exact – there is a small element of randomness or “noise” in there.

The popular choice for this kind of test is a feed-forward network, which is trained through back-propagation. That network requires hidden units, and each output (is it N, it it Q) is totally ignorant of the other outputs, which I think is a detriment.

My network variant has just finished a training run after 44 training cycles. That is proof that the simple recurrent network can learn to recognise simple letters without relying on hidden units.

Another interesting thing about the method I used is how the training works. Instead of throwing a huge list of tests at the network, I have 26 tests, but only a set number of them are run in each cycle depending on how many were gotten right up until then. For example, a training cycle with 13 tests will only be allowed if the network previously successfully passed 12 tests.

There are still a few small details I’d want to be sure about before pronouncing this test an absolute success, but I’m very happy with it.

Next week, I hope to have this demo re-written in Java, and a new demo recognising flowers in full-colour pictures (stretching myself, maybe…).

As always, this has the end goal of being inserted in a tiny robot which will then do my gardening for me. Not a mad idea, I think you’re beginning to see – just a lot of work.

update As I thought, there were some points which were not quiet perfect. There was a part of the algorithm which would artifically boost the success of the net. With those deficiencies corrected, it takes over 500 cycles to get to 6 correct letters. I think this can be improved… (moments later – now only takes 150+ to reach 6 letters)

neural net balancing thing

Last century, when I worked for Orbism, a co-worker, Olivier Ansaldi (now working for Google) showed me a Java applet he was working on which learned how to balance a stick using a neural net.

I decided to try it myself, and wrote a neural net that does it yesterday.

demo (Firefox only)

There is a total of 3 neurons in the net – 1 bias, 1 input (stick angle) and 1 output.

It usually takes about 20 iterations to train the net. Sometimes, it gets trained in such a way that the platform waggles back and forward like a drunk, and sometimes it gets trained so perfectly that it’s damn boring towatch (basically, it’s a platform with a stationary stick on it).

Anyway… for my next trick, I’ll try building a net which can recognise letters and numbers.

Partly the point of this was that it was an itch I wanted to scratch. But also, I wanted to write it in C++ but am better at JavaScript so wrote it in JS to test it first before I attempt a port.

I’m getting interested in my robot gardener idea again, so am building up a net that I can use for it.

Some points about how this differs from “proper” ANNs.

  • Training is done against a single neuron at a time (not important in this case, as there are only three neurons anyway).
  • This net will attach all “normal” neurons to all other neurons. I don’t like the “hidden layer” model of ANNs – I think they’re limited.
  • No back-prop algorithms are used – I don’t trust “perfect” nets and prefer a bit of organic learning in my nets.
  • The net code itself is object-oriented and self-sufficient. It would be possible to take the code and use in another JS project.