08 Jan

idea for music recognition, conversion and composition using artificial neural networks

I had this idea while walking the kids to school. Starting from a simple network that can classify music styles as rock/metal/classical/folk/etc, I think that it would be possible to adapt the same algorithm to convert a music file from one style to another, and even write music from scratch in whatever style you want. And if I’m right, I think it would be very simple to write.

Recognition

This is the simplest task. To recognise the style of a music file, all you need is a feed-forward network with a few thousand inputs, at least one hidden layer, and one output for each style you want to recognise.

A standard data rate for recorded music is 160kbps. Taking each wave height as 16 bits, that means that every second, there are 10,240 separate wave heights (160*1024/16) that need to be examined. Of course, you can recognise music using lower bps values, but let’s use the same setting for the whole process (160 will be wanted for later parts).

So, the input layer would need 10,240*n inputs, where n is the number of seconds you want the network to sample in order to determine the style. In some cases (metal/classical), you may get away with sampling just a single second, but for better results, you might want a larger value. I’ll be setting n to 300, so it samples the entire song in most cases. This makes it easier to be accurate about the result, but will also be useful in a later stage.
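The sum above is easy to check in a few lines of JavaScript. This is just a sketch of the arithmetic as stated in this post, assuming 16 bits per wave height; the function name `inputCount` is made up for illustration.

```javascript
// Number of network inputs, following the sum used above:
// (kbps * 1024 / bitsPerSample) wave heights per second, times the seconds sampled.
// Assumption: 16 bits per wave height, as in the 160*1024/16 sum.
function inputCount(kbps, seconds, bitsPerSample = 16) {
  const perSecond = (kbps * 1024) / bitsPerSample;
  return perSecond * seconds;
}

console.log(inputCount(160, 1));   // inputs for a one-second network
console.log(inputCount(160, 300)); // inputs for the 300-second network
```

With n set to 300, that comes to 3,072,000 inputs, which gives an idea of how large the input layer would be.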

The output layer needs to have one node per tag you want to measure. For example, you might have an output that measures how “rock” a song is, and another that measures how “baroque” it is. You could use output nodes that return a simple Yes/No result, but there is a good reason to return a more linear certainty instead (which we’ll get to).

The hidden layer needs at least one neuron, obviously, but I don’t think there is any way to say exactly how many it needs, so it would be better to use a network model which grows automatically as it learns (I don’t know the technical term – I just build the things!).

After building the network, you need to train it. This is the easiest part – you just need a large database of music, and tags for every one of those tunes.

One handy idea: if you’re training a 5 second network (for example), then a 3 minute song has at least 36 completely separate training sets for you to sample (180/5). You can get even more by starting the inputs at second 0, 1, 2, or at fractional offsets such as 0.5, and the network will see what it thinks (initially) is a completely different data set.
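Here is a rough sketch of how those training windows could be listed. The function name `windowStarts` and its parameters are hypothetical; it only works out the start offsets and does not touch any audio.

```javascript
// Start offsets (in seconds) for training windows cut from one song.
// A step equal to the window length gives non-overlapping windows;
// a smaller step gives overlapping ones.
function windowStarts(songSeconds, windowSeconds, stepSeconds) {
  if (stepSeconds <= 0) {
    throw new Error('stepSeconds must be greater than zero');
  }
  const starts = [];
  for (let i = 0; i * stepSeconds + windowSeconds <= songSeconds; i++) {
    starts.push(i * stepSeconds);
  }
  return starts;
}

// A 3 minute song, a 5 second network, non-overlapping windows.
console.log(windowStarts(180, 5, 5).length);
// The same song with a window starting every second.
console.log(windowStarts(180, 5, 1).length);
```

The first call gives the 36 separate sets mentioned above; smaller steps give many more, at the cost of the sets overlapping.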

After training this for a while, you should be able to run a few seconds of a song through the network and have fairly accurate results of how “funk” or “jazz” a song is.

Conversion

After figuring out the above, I started thinking of alternative uses for the idea, and one surprising idea took hold.

Let’s say that you have a “folk” song played on guitar and violin. How would you go about making it “metal”? You could start by fuzzing the violin and distorting the guitar, and maybe adding some drums in.

I think it would be possible to write a program which lets you convert a song from one style to another literally at the click of a button.

Remember I mentioned that the output neurons should say how metal/classical/etc a song is, not just that it is or is not.

If the network is written with enough precision, then adjusting one or more of the input values should give a different value in the outputs.

As an example, let’s say you have a folk tune that you want to convert to neo-punk. Adjusting the inputs such that the sounds are more distorted (clipping high values, for example), or faster (shifting later inputs to the left, maybe) might change the tune’s “neo-punk” output from 0.00024 to 0.00025.

If you repeat this over and over (automatically, obviously), discarding changes that reduce the output and repeating changes that increase the output, until the “neo-punk” output reaches an acceptable threshold such as .9, then you have just created an automatic way to convert a tune from one style to another.
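The loop described above is basically hill climbing. Below is a minimal sketch of it, not a working converter: `score` stands in for the trained network’s output for one style, and `mutate` for whatever adjustment (clipping, shifting, etc) is being tried. Both are hypothetical, and the demo uses a toy score (average loudness) just so that it runs.

```javascript
// Hill-climbing sketch: keep a change only when the target output goes up.
// `score` is a stand-in for the network's output for one style (hypothetical).
function hillClimb(input, score, mutate, threshold, maxSteps) {
  let best = input.slice();
  let bestScore = score(best);
  for (let step = 0; step < maxSteps && bestScore < threshold; step++) {
    const candidate = mutate(best);
    const candidateScore = score(candidate);
    if (candidateScore > bestScore) {
      // the change increased the output, so keep it
      best = candidate;
      bestScore = candidateScore;
    }
    // otherwise the change is discarded and we try another one
  }
  return { samples: best, score: bestScore };
}

// Small repeatable pseudo-random generator, so the demo gives the same run each time.
function makeRandom(seed) {
  let state = seed >>> 0;
  return function () {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Toy demo: the "style" being maximised is simply average loudness.
const random = makeRandom(1);
const toyScore = samples =>
  samples.reduce((sum, v) => sum + Math.abs(v), 0) / samples.length;
const toyMutate = samples => {
  const copy = samples.slice();
  const i = Math.floor(random() * copy.length);
  const nudged = copy[i] + (random() * 0.2 - 0.1);
  copy[i] = Math.max(-1, Math.min(1, nudged)); // keep wave heights in [-1, 1]
  return copy;
};
const result = hillClimb(new Array(8).fill(0), toyScore, toyMutate, 0.9, 100000);
console.log(result.score);
```

With a real network, each call to `score` is a full forward pass over millions of inputs, so the number of steps needed is the thing to watch.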

I think this has a lot of applications. For example, let’s say you want to convert a piano tune to guitar? You train your network to recognise what piano and guitar tunes sound like, and then simply convert as above!

Composition

This may be the simplest of the lot.

After creating the above programs, try inputting a sound sample of pure static into the conversion program, and tell it to convert the static to piano. I think it would come up with some interesting tunes. Maybe not completely accurate tunes, but they would be interesting.

I think the network would automatically learn rules about harmony and rhythm, but don’t think it would learn about structure. For example, you could train a network to recognise a 3/4 rhythm, but I don’t know if you could write something that recognises a fugue.

13 Jun

Simple geo-ip based links

simple geoip based links, for when you need to link to different files depending on the client’s country. requires PHP, jQuery.

in the head of the document, have this:

<script>window.geoip_data=<?php echo file_get_contents('http://freegeoip.net/json/'.$_SERVER['REMOTE_ADDR']); ?>;</script>

for the HTML links, write the default link target into the HTML, with alternatives written in for each country. here’s an example for the UK (whose country code is GB) and Ireland (IE):

<a href="/link.html" class="geo" data-link-GB="/link-uk.html" data-link-IE="ie.html">click here</a>

now in the JavaScript, process all .geo links:

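$(function() {
  // turn e.g. "IE" into "Ie", so that dataName becomes "linkIe",
  // which is the name jQuery's .data() uses for the data-link-IE attribute
  var country=geoip_data.country_code.substring(0,1)
    +geoip_data.country_code.substring(1,2).toLowerCase();
  $('a.geo').each(function() {
    var $this=$(this), dataName='link'+country;
    // only swap the href if this link has an alternative for the country
    if ($this.data(dataName)) {
      $this.attr('href', $this.data(dataName));
    }
  });
});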
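The odd-looking capitalisation in that script is easier to see in isolation. HTML lowercases attribute names, so data-link-IE is stored as data-link-ie, and jQuery’s .data() exposes that under the camelCase key “linkIe”. Here is a small stand-alone sketch of the mapping; the helper name `dataKeyFor` is made up for illustration and is not part of jQuery.

```javascript
// Build the key that jQuery's .data() uses for a data-link-XX attribute:
// "link" + first letter upper-cased + the rest lower-cased.
function dataKeyFor(countryCode) {
  return 'link'
    + countryCode.charAt(0).toUpperCase()
    + countryCode.slice(1).toLowerCase();
}

console.log(dataKeyFor('IE')); // "linkIe"
```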

Done!

12 Feb

File server, using Raspberry Pi as the controller

This weekend’s task is to convert an ageing and awkward file server into something more manageable. I’ll do this by basically replacing the current controller (an old laptop) with a Raspberry Pi, and shoving everything into a box.

Firstly, an apology: I’m a hacker, in that when I create something, it will work, but it may not be pretty. I’ll leave the works of art to others, and focus on what I’m good at: solving the current problem with what I currently have at hand.

Here is the old system:

IMAG1051

Four hard-drives, a laptop, a USB hub, lots of cables, lots of power supplies.

The goal is to cut it all down into just one box, one power cable, one ethernet cable.

IMAG1148

So, first, I got an old ATX power supply from a machine I had lying around, and converted it so it didn’t require a motherboard to run.

IMAG1038

To do this, open up the power supply box, remove any cables (leading to outside the box) that are not red, orange, black or yellow, and connect the green cable to one of the blacks (remove some of the insulation from the black cable first, obviously). The green cable tells the power supply that it’s okay to turn on.

IMAG1048

While in there, I cleaned out some of the dust and fluff, and neatened the cables a little. There are four voltages that come from the power supply unit: 0v (black), 3.3v (orange), 5v (red), 12v (yellow). I wasn’t sure how many of each of these cables are actually needed to supply enough current, so when I was pruning cables from the ATX box, I just left these all intact.

Next, I laid out everything in a way that I thought made sense:

IMAG1056

If the cables are removed, then the above will fit into a box that’s 27cm x 27cm x 18cm, with a few millimetres spare on sides and top, and a few centimetres at the back. We need the space at the back so we can re-organise the cables.

So, I cut out the pieces for the box walls. The wood I had (which I scavenged from the attic “floor”) is 1.2cm thick, so the panels I cut are:

  • two 27cm x 27cm, for bottom and top of the box
  • two 27cm x 15.6cm, for front and back
  • two 24.6cm x 15.6cm, for side walls

The last two, I left a little shy of 15.6. Partly because of ventilation, but mostly because I forgot the old adage to measure twice before cutting.
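For anyone building a box of a different size, the cut list above follows from the outer dimensions and the thickness of the wood. This is a small sketch of that arithmetic, assuming the same layout as mine (top and bottom cover the whole footprint, front and back sit between them, sides sit between front and back). The function name `panelSizes` is hypothetical, and everything is in millimetres to avoid rounding fuzz.

```javascript
// Panel cut list for a box where the top and bottom cover the full footprint,
// the front and back sit between them, and the sides sit between front and back.
// All sizes are in millimetres.
function panelSizes(widthMm, depthMm, heightMm, thicknessMm) {
  const innerHeight = heightMm - 2 * thicknessMm;
  return {
    topAndBottom: [widthMm, depthMm],
    frontAndBack: [widthMm, innerHeight],
    sides: [depthMm - 2 * thicknessMm, innerHeight],
  };
}

// The box in this post: 27cm x 27cm x 18cm, using 1.2cm wood.
console.log(panelSizes(270, 270, 180, 12));
```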

I put the floors and walls of the box together, with wood glue and screws.

IMAG1060

Next, I needed to make the power rails. After a little thought, I decided to use two coat hangers (yes, seriously!), straightened then cut into four lengths of 27cm each. The resistance of each length was between .5 and .7 ohm, so I was pretty sure they’d do fine.

IMAG1059

I drilled holes at the back end of the side walls and put the rails in. I’ve been told since building the thing that this setup may cause electro-magnetic interference – especially as the rails look a bit like aerials. If it causes a problem, I think I’ll need to coat the box with metal.

IMAG1061

Next, I started wiring up the PSU. First, I placed the PSU in the box and measured roughly how long the cables should be, then I cut them about 3cm longer than that, and stripped the extra 3cm of each wire.

IMAG1062

I then tied each wire to a rail making sure the stripped area of the wire was in full contact with the rail. Orange (3.3v) at the bottom as it’s unlikely to be used, then red (5v), black (0v) and yellow (12v) at the top. In retrospect, the 12v and 5v should be the other way around for neatness, but I didn’t think of that at the time.

IMAG1063

Most hard drive enclosures use a 12v input, so what I did next was to take the hard drive power supplies, snip the cables so they were about 15cm long from the hard-drive end, then strip the ends and wire them up. The cables were wrapped in black plastic, the “live” wire having a broken white line printed along it. I connected the live wire to the 12v rail and the other to the 0v.

IMAG1064

Next, I took the case off the USB hub, and screwed it and the Raspberry Pi to the underside of the box’s lid.

IMAG1066

The power supply for the USB hub is 5v, and when I stripped it, I found this was indicated by the live wire (5v) being red. I left this cable longish (about 30cm), so I can hinge the top of the case and open it without disconnecting anything.

Power for the Raspberry Pi is supplied through a micro-USB cable. I scrounged one from somewhere in the house and stripped it down to about 30cm in length. The cable has four wires – black, red, green, white. We only need the red (5v) and black (0v).

Finally, I inserted all the hard-drives with USB cables connected, and hooked it all up. I’ll need to get shorter USB cables to neaten it further, but the original cables will do for now.

IMAG1148

Software-wise, I used Raspbian for the operating system, and ZFS for connecting the drives together and serving to the network as one large hard-drive.

28 Sep

networking a raspberry pi through your laptop

I finally got my own Raspberry Pi; a credit-card-sized computer that’s very cheap and low-power.

It didn’t come with any of the niceties that you would expect from another computer, such as a power supply or a case, or keyboard, monitor, or anything else. Basically, it’s like being given the motherboard of a desktop computer and you need to do the rest yourself.

So first thing was to install the operating system on it. This was easy. Just buy an SD card, and download the ISO of the OS that you want and copy it onto the card.

I already had a micro-SD card from an old Bada phone, so I just stuck that in an adaptor to bring it up to SD card size, then installed the Fedora remix using the Fedora Arm Installer. Painless.

Raspberry Pi connected to laptop using cross-over cable

Next, we need to connect the machine to the network.

My network is mostly WiFi-based, so I chose to hook the RaspPi to the network by piping its network through my laptop.

We need to set up the laptop so it can hand out IP addresses. Install a DHCP server on your laptop. I’m using Fedora, so installed with “yum install dhcp”, then edited /etc/dhcp/dhcpd.conf:

option domain-name "example.org";
option domain-name-servers ns1.example.org, ns2.example.org;
default-lease-time 600;
max-lease-time 7200;
log-facility local7;
subnet 10.5.5.0 netmask 255.255.255.224 {
  range 10.5.5.26 10.5.5.30;
  option domain-name-servers ns1.internal.example.org;
  option domain-name "internal.example.org";
  option routers 10.5.5.1;
  option broadcast-address 10.5.5.31;
  default-lease-time 600;
  max-lease-time 7200;
} 

Then start the DHCP server with “service dhcpd start”.
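If you change the subnet in that config, the broadcast address and the range have to change with it. Here is a small sketch for checking the numbers; the helper names are made up for illustration and have nothing to do with dhcpd itself.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit number and back.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, part) => ((acc << 8) | Number(part)) >>> 0, 0);
}
function intToIp(n) {
  return [24, 16, 8, 0].map(shift => (n >>> shift) & 255).join('.');
}

// Broadcast address: the network address with all host bits set.
function broadcastAddress(network, netmask) {
  const hostBits = ~ipToInt(netmask) >>> 0;
  return intToIp((ipToInt(network) | hostBits) >>> 0);
}

// First and last addresses that can be handed to a machine.
function hostRange(network, netmask) {
  const first = ipToInt(network) + 1;
  const last = ipToInt(broadcastAddress(network, netmask)) - 1;
  return [intToIp(first), intToIp(last)];
}

console.log(broadcastAddress('10.5.5.0', '255.255.255.224'));
console.log(hostRange('10.5.5.0', '255.255.255.224'));
```

For the subnet above, this gives 10.5.5.31 as the broadcast address and 10.5.5.1 to 10.5.5.30 as the usable hosts, so the 10.5.5.26 to 10.5.5.30 range fits inside it.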

Next, we need to connect the laptop to the RaspPi. To do this, I made a cross-over cable, and plugged it into the RaspPi.

Before plugging it into the laptop, we need to tell the laptop’s network manager not to set up DHCP over eth0 (as we’re the server, not the client, as far as the cable is concerned). To do this in Gnome, right-click your network icon on the top-right, click Network Settings, and in Wired, click Options, then change the type to “Shared connection” (or whatever sounds like that).

Now plug the ethernet cable into the laptop, then plug a USB cable into the laptop and the RaspPi.

If you “tail -f /var/log/messages”, you should get something like the below after a minute:

Sep 28 21:12:30 iga dnsmasq-dhcp[30147]: DHCPDISCOVER(em1) 192.168.1.19 b8:27:eb:87:1d:86
Sep 28 21:12:30 iga dnsmasq-dhcp[30147]: DHCPOFFER(em1) 10.42.0.62 b8:27:eb:87:1d:86
Sep 28 21:12:30 iga dnsmasq-dhcp[30147]: DHCPREQUEST(em1) 10.42.0.62 b8:27:eb:87:1d:86
Sep 28 21:12:30 iga dnsmasq-dhcp[30147]: DHCPACK(em1) 10.42.0.62 b8:27:eb:87:1d:86 raspi

That 10.42.0.62 is the address of the RaspPi. You can ssh into it (username root, password fedoraarm), and do stuff!

02 Aug

installing kv-WebME in CentOS 6

I’m setting up a new server and need to install kv-WebME in it. The previous instructions (for Fedora 16) were fine for an older version of WebME, but the most recent requires more up-to-date packages.

So, first, we need to tell CentOS 6 to use more up-to-date packages than are provided by default.

su -
rpm --import  http://apt.sw.be/RPM-GPG-KEY.dag.txt
rpm -ivh http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el6.rf.x86_64.rpm

Now we need to install PHP (at /least/ 5.3), Apache, MySQL, etc.

yum install httpd mysql-server php php-gd php-mysql php-xml git zip unzip

Add a user account for the website, and download the kv-webme repository

adduser webme
su - webme
git clone https://github.com/kaeverens/kvwebme kv-webme
chmod 755 /home/webme
exit

Now add the web configuration to /etc/httpd/conf/httpd.conf

NameVirtualHost *:80
<Directory /home/webme/kv-webme>
  AllowOverride All
</Directory>
<VirtualHost *:80>
  DocumentRoot /home/webme/kv-webme
</VirtualHost>

And finally, turn it all on.

service httpd start
service mysqld start
chkconfig --level 35 mysqld on
chkconfig --level 35 httpd on

That’s it!

26 Jul

straightening an image of horizontal lines

I’m working on a mobile app for photographing sheet music and then playing it.

When I first approached this, I considered using a Hough transform, which is a mathematical tool for finding lines in an image. It produces a matrix based on m/c space (the slope and y offset of each line, as in y = mx + c).

I could then use the matrix to figure out what was being shown on the sheet.

That method is very computationally expensive.

After considering it, trying it, then abandoning it, a better solution came to me while I was thinking about something totally different.

Sheet music is composed of mostly horizontal lines, while everything that is not a horizontal line is part of the notation itself.

So, all I need to do is first locate the horizontal lines, and everything else will be easy to find.

The first problem, then, is how to make sure that the sheet itself is level.

How I ended up doing this was to average the colours along each row (each ‘y’ coordinate) of the image, measure the mean difference between those row averages and the overall average, and try offsetting one side of the sheet up and down until I reached the maximum mean difference.

This is easier to understand visually.

Let’s consider this image:

As a human, we find it easy to spot the skew and fix it, but a computer is not so intuitive.

Here is the same image with the pixels in each row (each “y” coordinate) averaged out along “x” (motion-blurred, basically)

That’s a simple average of the “x” coordinates, and there already appears to be a pattern.

Next we shift/skew one side of the image up or down a few pixels and test it again. In my tests, I use a naive “brute-force” test of all offsets from -15 to +15. Here are blurs of a -11 offset and a +11 offset:


-11

+11

Obviously, the right one is the -11 one. But how do we tell a computer what the “obvious” solution is?

Well, the right answer is probably to come up with a way to measure which one is more “noisy”, but I couldn’t think of a simple way to do that.

Instead, what I did was to measure the average colour in each image, use that average to find the mean difference in each image (how far from the average “gray” each line is), and the one we are looking for is the one with the highest mean difference.

Having found the right offset (-11), we then simply shift the pixels in the image by that much (in Y and X space), and end up with these images:
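Here is a minimal sketch of the search described above. It is not the code from my app: it assumes the image is a plain array of rows of grayscale values (0 is black, 255 is white), it fills pixels shifted in from outside with white, and the function names are made up for illustration.

```javascript
// Shift each column vertically so that the right edge moves `offset` rows
// relative to the left edge. Pixels shifted in from outside are white (255).
function skewImage(image, offset) {
  const height = image.length;
  const width = image[0].length;
  const out = image.map(() => new Array(width).fill(255));
  for (let x = 0; x < width; x++) {
    const shift = width > 1 ? Math.round((offset * x) / (width - 1)) : 0;
    for (let y = 0; y < height; y++) {
      const sourceY = y - shift;
      if (sourceY >= 0 && sourceY < height) {
        out[y][x] = image[sourceY][x];
      }
    }
  }
  return out;
}

// Average each row, then measure how far the row averages are, on average,
// from the overall average. Straight staff lines give a high value.
function meanDifference(image) {
  const rowMeans = image.map(row => row.reduce((sum, v) => sum + v, 0) / row.length);
  const overall = rowMeans.reduce((sum, v) => sum + v, 0) / rowMeans.length;
  return rowMeans.reduce((sum, v) => sum + Math.abs(v - overall), 0) / rowMeans.length;
}

// Naive brute-force test of every offset from -maxOffset to +maxOffset.
function findBestOffset(image, maxOffset) {
  let bestOffset = 0;
  let bestScore = -Infinity;
  for (let offset = -maxOffset; offset <= maxOffset; offset++) {
    const score = meanDifference(skewImage(image, offset));
    if (score > bestScore) {
      bestScore = score;
      bestOffset = offset;
    }
  }
  return bestOffset;
}

// Straighten by applying the winning offset.
function straighten(image, maxOffset) {
  return skewImage(image, findBestOffset(image, maxOffset));
}
```

Calling `straighten(image, 15)` tries the same -15 to +15 range I used in my tests. Each candidate offset touches every pixel once, so the cost grows with the image size times the number of offsets tried.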


original image

straightened

The next task is to fix skewing, but it will use basically the same technique.

demo

13 Jul

the awe of programming

“A few weeks into the class, there was a moment where I finally understood recursion. It felt so satisfying that my next thoughts went something like: ‘Wow, that’s awesome. I like that. I think I like computer science.'” – from betabeat.com

I can tell you that I know /exactly/ how she felt. I first “got” recursion (a programming method) after a summer scholarship in DCU, waaay back when I was in secondary school. It was really an awesome moment – understanding the possibilities of it felt like becoming one with the universe. It /really/ felt like that.

It felt “awesome”, and I mean that literally; I was in awe that such a simple concept could create such amazingly powerful solutions.

I’ve used recursion quite a lot over the years. In fact, only yesterday, I wrote a TSP algorithm that uses depth-first recursion to find the shortest distance between a number of points on a map. I’ve also used it for generating flow charts for food industry applications, creating breadcrumbs in website navigation, and for solving other seemingly unrelated problems.

As a father of two kids, I would love to have them take up my own path and become programmers, but I also know that you can’t “teach” the feeling of satisfaction/enlightenment that you get when you finally solve a tricky problem, and that feeling is very important to get early on if a child is to feel an urge to carry on.

Jareth (my son) doesn’t know it yet, but he’s getting Lego Mindstorms this year for Christmas. I already know he’s going to be a good programmer, based on his problem solving skills in some games, and some of the technic creations he’s built. Hopefully Mindstorms will let him have his own “ah hah!” moments early on, encouraging him to go deeper into programming as he gets older.

27 Jun

Raspberry Pi coming

I got an email earlier today saying that my Raspberry Pi would be dispatched in two weeks.

Exciting! I’m looking forward to getting back into my robot and finally giving it some intelligence.

I’ve been working on this robot idea for years, and always wanted to basically have a very very small programmable bot that could do some things intelligently, such as pick up rubbish, cut weeds, do a little bit of mapping, etc.

Now I need to decide what language I’ll use for the programming.

I’ve been doing PHP and JavaScript professionally for about 15 years, but Java and C are probably more appropriate.

I think I’ll be brushing the dust off my C books!

27 Mar

code golf

I came across a new (to me) game yesterday – Code Golf.

The game involves coming up with an algorithm to solve a programming problem, and trying to condense the code for the algorithm into the smallest number of bytes possible.

The first one I tried was this. It was a fascinating problem, and took me a day of musing on it (in the back of my head as I did other things) before I had a solid solution.

After writing the solution, which took 1380 bytes, it was time to start “golfing” it.

At first, I thought I’d try my compressor on it. This shrank it to 825 bytes, but after compression, it couldn’t be worked with anymore, so I thought I’d try compressing it manually.

This resource was fascinating, and gave me a load of pointers.

There were a few small points I came up with myself while working on it:

I prefer to use “\n” instead of “;” for command endings, as it makes code more readable. (the game is about shrinking the code, not obfuscating it)

A saving can be had by combining nested loops:

// before
for(i=5;i--;)for(j=5;j--;)M[i][j]=0
// after
for(i=25;i--;)M[0|i/5][i%5]=0

If you need to push into an array on multiple lines, make a shortcut for the push method:

// before (example)
a=[]
cond1()&&a.push(1)
cond2()&&a.push(2)
cond3()&&a.push(3)
cond4()&&a.push(4)
// after
a=[]
P=a.push.bind(a) // bind, or push loses track of which array it belongs to
cond1()&&P(1)
cond2()&&P(2)
cond3()&&P(3)
cond4()&&P(4)

If possible, find a maths way of identifying interesting points, instead of comparisons. (This one only works if no other x, y, z combination that can occur also multiplies to 48.)

// before
if((x==4&&y==4&&z==3)||(x==4&&y==3&&z==4)||(x==3&&y==4&&z==4))dosomething()
// after
if(x*y*z==48)dosomething()

I think this game is really interesting, and it will sharpen my own skills as a programmer, as it taxes the mind not only to find the solution to a problem, but also to express that solution as concisely as possible.

In the end, I was able to solve the problem in 668 characters – that’s 157 characters less than my compressor was able to manage.

24 Mar

musical intervals trainer, web version

last weekend, I wrote an intervals trainer app for practicing recognising intervals.

I want other people to use it, but haven’t got a Google development account yet so can’t upload an app.

So, today, I improved the app and made a web-accessible version.

try it out!

it’s designed to move up from very simple intervals (major/minor 2nd intervals, with only natural notes) to more difficult intervals (diminished/augmented, with double sharps and flats), but it’s also designed to only get more difficult at a rate that /you/ can manage.

to do this, the app uses a “levels” system, where each level adds one extra type of interval or note type, and you are tested on them. over 50% of the time, the question will be from the level you’re on, and the rest of the time, the question will be randomly chosen from the levels that you have already passed.

get 10 in a row correct, and you go to the next level.

but, get 5 wrong in a row, and you go down a level.
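the rules above can be sketched roughly like this. it’s not the app’s actual code: the 60% share is an assumption (all that matters is that it’s over 50%), resetting the streaks after a level change is also an assumption, and the function names are made up.

```javascript
const MAX_LEVEL = 24;            // there are 24 levels at the moment
const CURRENT_LEVEL_SHARE = 0.6; // assumption: anything over 50% fits the description

// Choose which level the next question comes from.
function pickQuestionLevel(level, random = Math.random) {
  if (level === 1 || random() < CURRENT_LEVEL_SHARE) {
    return level;
  }
  // otherwise pick one of the levels already passed (1 .. level-1)
  return 1 + Math.floor(random() * (level - 1));
}

// Update the player's state after an answer.
// 10 correct in a row moves up a level, 5 wrong in a row moves down one.
function recordAnswer(state, correct) {
  let { level, rightStreak, wrongStreak } = state;
  if (correct) {
    rightStreak += 1;
    wrongStreak = 0;
  } else {
    wrongStreak += 1;
    rightStreak = 0;
  }
  if (rightStreak >= 10) {
    level = Math.min(MAX_LEVEL, level + 1);
    rightStreak = 0; // assumption: streaks start again on a new level
  }
  if (wrongStreak >= 5) {
    level = Math.max(1, level - 1);
    wrongStreak = 0;
  }
  return { level, rightStreak, wrongStreak };
}
```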

at the moment, there are 24 levels – all the way up to augmented 8ths – can you get through all the levels?

give it a try!