]]>

In the course of working through some (very good) material on neural networks (which I may try to work through here later), I noticed that it was beneficial for a so-called “activation function” to be able to be written as the solution of an “easy” differential equation. Here by “easy” I mean something closer to “short to write” than “easy to solve”.

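In particular, two often-used activation functions are the logistic sigmoid, *σ(x) = 1/(1 + e^{-x})*, and the hyperbolic tangent, *τ(x) = tanh(x)*.

One might observe that these satisfy the equations

*σ′ = σ(1 − σ)*

and

*τ′ = 1 − τ^{2}*.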

By invoking some theorems of Picard, Lindelöf, Cauchy and Lipschitz (I was only going to credit Picard until Wikipedia set me right), we recall that we could start from these (separable) differential equations and fix a single point to guarantee we would end up at the functions above. In seeking to solve the second, I found after substituting *cos(u) = τ* that

*−∫ csc(u) du = ∫ dx*,

and shortly after that, I realized I had no idea how to integrate *csc(u)*. Obviously the internet knows (substitute *v = cot(u) + csc(u)* to get the integral being *−log(cot(u) + csc(u))*), which is a really terrible answer, since I would never have gotten there myself.
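If, like me, you do not trust an antiderivative you could not have found yourself, a quick numerical sanity check helps. The sketch below compares a finite-difference derivative of *−log(cot(u) + csc(u))* against *csc(u)* at a handful of points in (0, π). The function names and sample points are my own choices for illustration.

```python
import math

def antiderivative(u):
    # Candidate antiderivative of csc(u), valid for 0 < u < pi.
    return -math.log(math.cos(u) / math.sin(u) + 1.0 / math.sin(u))

def central_difference(f, u, h=1e-6):
    # Symmetric finite-difference approximation of f'(u).
    return (f(u + h) - f(u - h)) / (2 * h)

def max_error(points):
    # Largest gap between the numerical derivative and csc(u).
    return max(
        abs(central_difference(antiderivative, u) - 1.0 / math.sin(u))
        for u in points
    )

points = [0.3, 0.8, 1.2, 2.0, 2.8]  # arbitrary sample points in (0, pi)
print(max_error(points))
```

A tiny printed value means the derivative of the candidate matches *csc(u)* to within finite-difference error at those points, which is reassuring but of course not a proof.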

Instinctually, I might have tried the approach to the right, which gets you back to where we started, or by changing the numerator to *cos ^{2}*x+

Avoiding the overwhelming temptation to split this integral into summands (which would leave us with a

Now substituting

Using the half-angle formulae that everyone of course remembers and dropping the *C* (remember, there’s already a constant on the other side of this equation), this simplifies to (finally)

Subbing back in and solving for *τ* gives, as desired, *τ = tanh(x)*.

Phew.
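For anyone who would rather let a computer do the integrating, here is a minimal sketch that goes the other direction: start from a short differential equation plus a single point and step forward numerically. I am assuming the two equations are *σ′ = σ(1 − σ)* with *σ(0) = 1/2* and *τ′ = 1 − τ^{2}* with *τ(0) = 0*; the Runge-Kutta helper and its name are mine, not from the neural network material.

```python
import math

def rk4(rhs, y0, x_end, steps):
    # Classical fourth-order Runge-Kutta for an autonomous equation y' = rhs(y),
    # integrated from x = 0 to x = x_end.
    h = x_end / steps
    y = y0
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

def tanh_rhs(t):
    # Assumed equation: tau' = 1 - tau^2
    return 1 - t * t

def sigmoid_rhs(s):
    # Assumed equation: sigma' = sigma * (1 - sigma)
    return s * (1 - s)

approx_tanh = rk4(tanh_rhs, 0.0, 2.0, 200)
approx_sigmoid = rk4(sigmoid_rhs, 0.5, 2.0, 200)

print(approx_tanh, math.tanh(2.0))
print(approx_sigmoid, 1 / (1 + math.exp(-2.0)))
```

The two numbers on each line should agree to many decimal places, which is the uniqueness theorem doing its job.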

]]>

There is also a “Message” class inside where you can play around with how many corrupted bits your message might have versus how long your message is versus the size of Hamming matrix you use. The defaults are set at 3 corrupt bits in a set of 4000 bits, with the error checking done with the Hamming(7,4) code. You can run this by downloading hamming_codes.py and running “python hamming_codes.py” from the command line.

The specific directory with this project is located inside the “HammingCodes” folder. Possible experiments with this code later, but now I need sleep!

]]>

Specifically, someone will receive a (possibly corrupt) 7-bit string **v**, and we want a matrix that will output **0** if all is well, or communicate a number 1-7 if one of the bits is corrupt. It takes 3 bits to communicate 8 numbers (8 = 2^{3}), so our parity matrix *H* (following Wikipedia’s notation) must be 3 x 7. To make it easy to remember, we’ll even have column *j* be the binary representation for *j* (reading each column from the bottom row up). More directly:

1 0 1 0 1 0 1

0 1 1 0 0 1 1

0 0 0 1 1 1 1

Now we can work backwards (again, we’re *assuming* an answer exists), and for reasons that may be clear later, we’ll set our three parity bits to be the three “singleton” columns of *H*, so that the “coded” message **v =** (p1,p2,d1,p3,d2,d3,d4). Then if everything goes smoothly, we have that *H***v** = **0**, so that

0 = p1+d1+d2+d4

0 = p2+d1+d3+d4

0 = p3+d2+d3+d4.

Notice also that if one bit gets corrupted, this is equivalent to sending the message **v+e**_{j}, and

*H*(**v+e**_{j}) = **0+h**_{j},

where **h**_{j} is the *j*th column of *H* (which is the binary representation of the number *j*). Hence multiplying *H* by a message with a 1-bit mistake gives us the index of the corrupt bit.
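To see the "index of the corrupt bit" claim in action, here is a small sketch. The rows of the parity matrix are written in the same order as the three parity equations above, so the first row holds the least significant bit of each column; that ordering, the function names, and the sample codeword are my choices for illustration.

```python
# Parity matrix: column j (counting from 1) is j written in binary,
# with the least significant bit in the first row.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(word):
    # Multiply H by the received word, working mod 2.
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in H]

def syndrome_index(word):
    # Read the syndrome as a binary number: 0 means "all is well",
    # otherwise it is the (1-based) position of the flipped bit.
    s = syndrome(word)
    return s[0] + 2 * s[1] + 4 * s[2]

def flip(word, j):
    # Return a copy of word with bit j (1-based) flipped.
    out = list(word)
    out[j - 1] ^= 1
    return out

codeword = [0, 0, 1, 1, 0, 0, 1]  # satisfies all three parity equations
for j in range(1, 8):
    print(j, syndrome_index(flip(codeword, j)))
```

Each printed pair should show the same number twice: flipping bit *j* produces a syndrome that spells out *j*.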

But this tells us how we must *encode* our message **m** = (d1,d2,d3,d4) as well. We want a matrix *G* so that *G***m** = **v** = (p1,p2,d1,p3,d2,d3,d4). But the parity equations above give us a linear condition for what this matrix must look like (and an explanation for why the parity bits are all “singletons”).

Finally we want to “decode” our message, which is also straightforward at this point, since it will just be the matrix *R* which returns the non-parity bits from the encoded message.

As a review, and to wrap everything up:

1. Start with a message **m** = (1,0,0,1)

2. Transmit the message **v** = *G***m** = (0,0,1,1,0,0,1)

3. Check the parity by confirming that *H***v** = (0,0,0).

4. Decode the message *R***v** = (1,0,0,1), as desired.
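The four steps above can be sketched in a few lines of plain Python. The matrices are built directly from the parity equations and the bit ordering (p1,p2,d1,p3,d2,d3,d4); the helper names, and the correction step that flips the bit named by the syndrome, are my own additions for illustration rather than a copy of any particular implementation.

```python
# Encoding matrix: each row says how one coded bit is built from (d1,d2,d3,d4).
G = [
    [1, 1, 0, 1],  # p1 = d1 + d2 + d4
    [1, 0, 1, 1],  # p2 = d1 + d3 + d4
    [1, 0, 0, 0],  # d1
    [0, 1, 1, 1],  # p3 = d2 + d3 + d4
    [0, 1, 0, 0],  # d2
    [0, 0, 1, 0],  # d3
    [0, 0, 0, 1],  # d4
]

# Parity matrix: column j is j in binary, least significant bit in the first row.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

# Decoding matrix: picks out positions 3, 5, 6, 7 (the data bits).
R = [
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
]

def mat_vec(matrix, vector):
    # Matrix-vector product mod 2.
    return [sum(a * b for a, b in zip(row, vector)) % 2 for row in matrix]

def encode(message):
    return mat_vec(G, message)

def correct(word):
    # Flip the bit named by the syndrome, if the syndrome is nonzero.
    s = mat_vec(H, word)
    index = s[0] + 2 * s[1] + 4 * s[2]
    fixed = list(word)
    if index:
        fixed[index - 1] ^= 1
    return fixed

def decode(word):
    return mat_vec(R, correct(word))

message = [1, 0, 0, 1]
sent = encode(message)
garbled = list(sent)
garbled[4] ^= 1  # corrupt the fifth bit in transit
print(sent, mat_vec(H, sent), decode(garbled))
```

The last line prints the transmitted word, its (all zero) parity check, and the message recovered from a word with one flipped bit.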

Wikipedia’s also got an explanation involving Venn diagrams which I did not much like, though I *may* write a bit about Venn diagrams themselves in the future…

]]>

Week summary on Strava.

]]>

So now we know the general idea behind Hamming error correcting codes, and how one might construct and visualize hypercubes. Now suppose we want to encode, in an error correcting way, 4 bits. Recall that this means finding a hypercube with enough vertices that we can designate 16 (=2^{4}) of them, *and* pick those 16 so that no two “symbol vertices” are closer than distance 3. This means each “symbol vertex” has a disjoint neighborhood of distance 1.

A back of the envelope calculation gives a necessary condition to allow this: an *n* dimensional hypercube has 2^{n} vertices and each vertex has *n* neighbors (so a “symbol neighborhood” takes up *n*+1 vertices). Hence it is necessary that *n* satisfy

16*(*n*+1) ≤ 2^{n}.

More generally, to encode *m* bits in *n* bits, we require 2^{m}*(*n*+1) ≤ 2^{n}. Note without proof (for now, hopefully soon by construction) that this is also a sufficient condition. Interesting from an efficiency point of view is seeing where equality exists.

Taking logs (base 2) of both sides in the case of equality gives *m = n* – log(*n*+1). Since log(*n*+1) is an integer only when (*n*+1) is a power of 2, we let *n* := 2^{k}-1 and get *m* = 2^{k}-1 – *k*. In fact, one may (and we *may*) describe a whole class of Hamming (2^{k}-1, 2^{k}-1 – *k*) codes.
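The counting argument is easy to play with numerically. This sketch finds the smallest *n* satisfying the necessary condition for a given *m*, and lists the (*n*, *m*) pairs where equality holds; the function names are mine.

```python
def min_code_length(m):
    # Smallest n with 2^m * (n + 1) <= 2^n, the necessary condition above.
    n = m
    while 2**m * (n + 1) > 2**n:
        n += 1
    return n

def hamming_parameters(k):
    # The equality case: n = 2^k - 1 coded bits carrying m = n - k message bits.
    n = 2**k - 1
    return n, n - k

print(min_code_length(4))
for k in range(2, 7):
    n, m = hamming_parameters(k)
    print(k, (n, m), 2**m * (n + 1) == 2**n)
```

The first line printed is the 7 in Hamming(7,4), and every row after it should end in `True`.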

]]>

1. vertices are located at coordinates made of entirely 0’s and 1’s, and

2. has an edge wherever two vertices are distance 1 apart.

This would take two more things to make a complete definition: I should let you move the cube about however you like (no reason to have it fixed in space), and I should tell you about the 2-D faces, 3-D hyperfaces, and so on up to the (*n*-1)-D hyperfaces. You can use that first one if you want, but I’ll ignore the second. I think I did a good job of defining what’s called the 1-skeleton of a very particular *n*-dimensional hypercube.

Anyways. Wednesday had pictures of a 2-cube and 3-cube. What about the 4-cube? Or 5-cube? It will help to consider this all from a less analytic, more graph theory (or, if that sounds technical, “pictures and problem solving”) point of view. Condition 1 for a hypercube says that there are 2^{n} vertices, all the binary sequences of length *n*. Then condition 2 says that two vertices are connected if you can change one vertex’s binary sequence to the other’s by changing a single bit. We’ll go one step further, by just coloring particles on a line: white for 0, black for 1 (this is something of a homage to my undergraduate thesis advisor’s work with polyhedra).
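The graph theory description translates almost word for word into code. Here is a minimal sketch that lists the vertices as binary strings, joins two of them whenever they differ in a single bit, and draws each vertex as a row of white and black dots; the function names and the particular dot characters are my own choices.

```python
from itertools import product

def hypercube(n):
    # Condition 1: the vertices are all binary sequences of length n.
    vertices = ["".join(bits) for bits in product("01", repeat=n)]
    # Condition 2: an edge joins two vertices that differ in exactly one bit.
    # Flipping only 0 -> 1 lists each edge once.
    edges = []
    for v in vertices:
        for i in range(n):
            if v[i] == "0":
                edges.append((v, v[:i] + "1" + v[i + 1:]))
    return vertices, edges

def picture(vertex):
    # White particle for 0, black particle for 1.
    return "".join("○" if bit == "0" else "●" for bit in vertex)

vertices, edges = hypercube(4)
print(len(vertices), len(edges))
for v, w in edges[:4]:
    print(picture(v), "--", picture(w))
```

Counting edges this way gives *n*·2^{n-1} of them, since each of the 2^{n} vertices has *n* neighbors and every edge gets counted from both ends.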

The only two things left to do are to draw the vertices and arrange them in nice ways (that is, find a “nice” projection).

Below are the images from Wikipedia of the 5-, 6-, and 7-cubes. Note that some of the vertices are lying on top of each other. I’ll leave it as an exercise to the reader to label these vertices with the appropriate binary sequences.

]]>

Ran Friday night and really hammered about 2 miles at the end, forgetting I had a workout this (Saturday) morning. Ended up only doing 20mins of a planned 30min tempo, and really struggled home this morning. Overall still happy with week 1 of marathon training, but I’ll have to be better with gauging effort in the future. Will likely break out the HRM in the future.

On the plus side, blew away a segment record on my Friday run, going 5:17/mi for 1.2 miles near the end.

]]>

]]>

01101,

but there’s a chance that you receive instead something like 00101 (so the second bit was flipped), or 01111 (so the fourth bit was flipped). Is there any way to control for this? An idea that is both natural and wrong is to send two copies of the message. In this case, I’d send along

0110101101.

Now if a bit gets flipped (note that there are now more bits that could be flipped), you can see exactly where — perhaps you receive

0**0**1010**1**101,

where the non-matching bits are highlighted. The problem here is that you cannot tell whether the offending bit should be interpreted as a 0 or a 1 (which might be ok if you are only interested in error *detection*). But if we want to be able to correct the error, we’ll have to be a little more clever. As a very broad outline of our strategy, we are going to take each of the symbols we would like to transmit and encode them into a “big enough” space so that each symbol has a buffer zone around it to account for any errors in transmission.
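Here is a quick sketch of the "send it twice" idea, to make the detection-but-not-correction point concrete. The function name is mine, and the received string is the example with the second bit flipped.

```python
def mismatches(received):
    # Split the received string into its two copies and report the
    # (1-based) positions where the copies disagree.
    half = len(received) // 2
    first, second = received[:half], received[half:]
    return [i + 1 for i in range(half) if first[i] != second[i]]

print(mismatches("0110101101"))  # clean transmission
print(mismatches("0010101101"))  # second bit flipped in transit
```

The second call reports position 2, so we know *where* the copies disagree. Nothing in the output says which copy to believe, though, and that is the whole problem.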

In some sense, this is a familiar strategy: on touchscreen keyboards you do not have to hit the *exact* spot for a key to work, only be close enough. In this case, the “symbols” I’m referring to are letters, and the “buffer zone” is the area of the screen you can touch so the computer recognizes that you are trying to touch this particular letter.

The trick here (and what is sort of confusing about this) is that the symbols we wish to transmit are bits, and the space that we will be embedding our symbols into will also be bits (rather than a shiny retina display!). As a hint of things to come, I’ve displayed below a representation of the space of 2-bit strings (which will not be big enough to correct 1-bit errors), and a representation of the space of 3-bit strings, which is of perfect size to correct 1-bit errors in a 1-bit message.
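To make the "buffer zone" idea concrete for a 1-bit message, here is a small sketch. I am assuming the natural choice of symbols, namely repeating the bit (00 and 11 in the 2-bit space, 000 and 111 in the 3-bit space); the function names are mine. Decoding just means finding the closest symbol, where "close" counts the number of differing bits.

```python
def distance(a, b):
    # Number of positions where two equal-length bit strings differ.
    return sum(x != y for x, y in zip(a, b))

def nearest(word, symbols):
    # All symbols at minimal distance from the received word.
    best = min(distance(word, s) for s in symbols)
    return [s for s in symbols if distance(word, s) == best]

print(nearest("01", ["00", "11"]))     # a tie: we cannot decide
print(nearest("010", ["000", "111"]))  # unambiguous
print(nearest("110", ["000", "111"]))  # unambiguous
```

With 2 bits, a single flipped bit lands exactly halfway between the two symbols. With 3 bits, every string is within one flip of exactly one symbol, so a single error can always be undone.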

]]>