❧ And I Thought HTML Was Supposed to Be a Real Markup Language
Every time I compile some C code I write I get – and I counted to make sure
– approximately 43 million errors, all of which are nagging me about
unbalanced parentheses and forgotten semicolons. And every time I gesticulate
violently and sternly address the screen, “If you’re so smart you fix it!”
Which is to say that the compiler doesn’t allow room for error. And it
shouldn’t, either. As soon as the compiler tries to be smarter than you are –
mind you, it is – and starts fixing your mistakes, it’ll inject some really
mind-blowingly stupid code that’ll leave you scratching your head and wondering
why you didn’t just fix it in the first place.
What I’m left wondering is why browsers accept bad code. They’re parsing a
language with syntax and specifications and all the paperwork to be a
legitimate language, yet they willfully drop into “quirks” mode to handle
malformed HTML. And the result is predictable: ordinary people don’t give a
hoot if their pages are valid because who the heck cares? It renders just fine,
doesn’t it?
Why, if my compiler doesn’t, should my browser bend over backward to render
pages that are invalid? No programmer would expect invalid code to compile, and
yet here we are, something like a decade after HTML was introduced, still
treating it as a baby. Can I say something? HTML is dead simple to write.
There are like two rules: every opening tag needs a closing tag, and some tags
– such as <a> – need specific attributes. Compare that to C, Python,
Java, etc. This isn’t rocket science.
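To show how little it takes to check the first of those rules, here is a rough Python sketch of a tag-balance checker built on the standard library’s html.parser. The names TagBalanceChecker and is_balanced, and the short list of void elements, are my own inventions for illustration; this is nowhere near a real validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a partial list, for illustration only).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class TagBalanceChecker(HTMLParser):
    """Keeps open tags on a stack and records any closing tag that does not match."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # A self-closing form such as <br/> opens and closes itself.
        pass

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def is_balanced(html):
    """Return True when every opened tag is closed in the right order."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return not checker.errors and not checker.stack
```

A strict browser would do something like this and then refuse to render the page; a quirks-mode browser shrugs and guesses.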
The History of Quirks
“Quirks” mode harkens back to the dark days of the internet. Fledgling web
developers (read: pubescent teenagers fiddling on Angelfire) were crafting
Web 1.0, replete with Tomb Raider walkthroughs and Real Ultimate Power
(which, as an aside, is still hilarious). And these pioneers had no time for
“syntax” or “rules”. How could they? Fueled only by raw vision and Tang, the
internet was born.
Corporations were taking notice. “Why,” they said, “we could use this
newfangledness for intranets.” And they promptly tasked the most capable
employees: the aging site admins whose jobs were slowly being replaced by
computers. Well, if you can’t beat them – you know.
And there were the visionaries. We all know them. They were the darlings of
Wall Street: eBay, etc. These men and women were going to change the
world. On their Herman Miller Aeron chairs they gazed into their crystal balls
and revealed to the world its fate, largely a concoction of digital money,
internet grocery stores, and beanie babies. The world had enough by 2000, but
the bubble boys had left their mark(up. Ha!).
The browsers of the time – Netscape and Internet Explorer – were playing
second fiddle to the internet. Success was black and white: either you render
the web, or the other guy does. If Joe here sees a jumbled mess at
joeisawesome.com with Netscape he’ll do what’s rational: curse loudly and
open up Internet Explorer.
And in that dark race quirks mode came to light.
What Quirks Mode Means Today
Quirks mode means that people don’t care. Validating your site is like extra
credit on a test. Only that kid with the huge glasses is going to care if he
gets it right. Yeah, we’ll try it, but we’re too cool to care.
There’s already a lot of discussion on the subject and it’d be redundant
to bring it up. Suffice it to say that the way browsers handle code today is
not good.
A Modest Proposal
Kill quirks mode.
But seriously.
We can’t leave out half the web, can we? There’s a lot – a lot – of
content out there that’s not valid HTML. And never will be. This is content
that people rely on: popular websites, corporate intranets, your website
works-in-progress. And cutting out the quirks mode of every browser would mean
alienating a lot of people and making life much harder for others. We can’t
realistically say that cutting out quirks mode is a good thing (though it’s
what I’m personally rooting for). Not to mention that it’ll never happen.
But what if Google didn’t index invalid HTML?
(Giving higher priority to valid over invalid HTML would have a similar
effect.)
Google has a very large stick to brandish: if you’re not on Google’s
search results, you’re not on the internet. Plain and simple. Companies know
this and already invest heavily in search engine optimization. Turn web
standards into an SEO strategy and you’ll have even the most remote corners of
the web evangelized.
How could we ever get Google to drink the Kool-Aid and actually pull this off?
I haven’t the slightest clue. It’s a pie-in-the-sky dream. But it could work.
❧ Introduction to Groff
Groff is the GNU implementation of AT&T’s troff and associated programs
(pic, tbl, etc.). It’s a typesetting language much like LaTeX.
But who cares! Here’s the canonical helloworld.ms:
.TL
Hello World
.AU
Devrin Talen
.AI
Awesome, Inc.
.AB no
.AE
.PP
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
est laborum.
Compile this into something useful with
% groff -ms -P-pletter helloworld.ms | ps2pdf - helloworld.pdf
Groff’s info page is chock full of good documentation and should be your
first resource for learning how to use it. But here are the rough strokes:
Lines that begin with . characters specify macros. For example, the .TL
macro sets up everything that follows as title text. The .AU macro
changes that to a different font for listing author names.
Unlike LaTeX, which uses begin and end statements to enclose sections,
groff uses macros to switch up the formatting until the next macro is
specified.
Everything has to be processed in one pass of the source document. This
means that tables, figures, etc. will all be positioned where you specify
them in the source, unlike LaTeX which will do its best to find a good
spot. (Also – groff won’t generate a bunch of temporary files and clutter
stuff up.)
It’s a venerable and time-tested typesetting language that should be a part of
anyone’s typesetting arsenal.
❧ I Hate Twitter and Yes, Thank You, I Did Take My Pills This Morning
Three weeks ago I pulled the plug on my Twitter account. Almost a year
ago I had fallen head over heels for that service, and when I finally cut the
cord I felt as if I had pulled my head out of the internet’s ass and taken a
breath of fresh air.
Twitter is the cool kids’ table all over again. It’s a positive-feedback system
that perpetuates the position of those with the most followers and steals the
lunch money from the nerds. Sour grapes, I know. But hear me out.
Meet Devrin
Devrin just signed up for Twitter. Exciting! He posts a message:
Hello, world!
Not brilliant, but it’ll do for an inaugural post. Besides, he has so much to
tell the world! He’ll beguile the internet with his 140-character wit and steal
their hearts! He’ll have many thousands of followers!
Devrin recovers from his unbridled enthusiasm and ventures into Twitter’s deep
bowels. “I must find someone to follow,” he thinks. And follow he does: the
first victim is none other than John Gruber, Merlin Mann follows,
and Steven Frank is felled soon thereafter. Devrin goes to his timeline
and admires it. His post, alongside the likes of those three! He feels moved to
post again:
This is super cool!
And quits Twitter forever – or not. He should have. In reality it took 52
weeks and a few hundred “tweets” before he gathered the intestinal fortitude to
do it.
Some Background
When Twitter was released it garnered naught but harsh words. No one could see
any merit in a system that restricted you to 140 characters. “It’s a waste of
bandwidth,” they decided. And moved on.
Twitter exploded a little more than a year later. Suddenly bloggers decided
that Twitter was the Next Big Thing and rang up accounts. Their dutiful
readerships followed suit, creating accounts for the sole purpose – as I did
– of reading their favorite blogger’s tweets. And bloggers loved it: their
audience at their fingertips! Have a question? No problem! Tweet it and –
zing! – you’ve got answers!
The unexpected explosion melted Twitter’s servers. And now the Big Thing to
blog was Twitter’s unreliability: what kind of self-respecting service has
hours of downtime? And the slashdot effect increased Twitter’s problems. And
users. Today, a bit later, things seem to be running more stably.
Hey You Up There! Twitter Sucks!
Why did Twitter’s first reviewers find so much to hate? They had no audience.
Twitter was like writing letters and dropping them on the ground. A little
while later these same people were back on Twitter lavishing praise because –
neat! – all of a sudden there were all these people that wanted to read this
crap you threw on the ground! Before, 140 characters was an arbitrary
limitation; now it’s “inspired” and “revolutionary.” What the heck changed?
The people calling the shots now like Twitter. It’s this New Medium that
connects people. In reality it’s a big rat race for followers and favorited
tweets. Not for those on the top – for those on the bottom. The basement of
Twitter is one big Digg comment thread; it’s the usual mix of ep1c fa1lz, brb
bathroom, and OMG! coot puppys! It’s as if you could create a site that looked
like Facebook to the guys on top and MySpace to those beneath.
The only people that anyone follows – even the popular users – are the
popular users. Strange! This is how Twitter fails: there’s no way to discover
users. Sure, point me to the search bar. Point me to the “follows” list on
everyone’s page. But there’s no way for me to find users that talk about the
same stuff I do. Twitter doesn’t put me in touch with these people. Twitter
doesn’t forge new connections, it only reproduces the blog & audience
relationship that already exists. That’s how Twitter fails. That’s why Twitter
sucks.
Twitter is your high school lunchroom where the uncool kids are too busy
peering over each other to get a glimpse of the cool table to even notice each
other.
One Year Later
And so a year later, thoroughly disillusioned, I pull the plug. Some will read
this and agree.
But the vast majority will – I feel – think, “You dolt. You can’t say you
don’t like Twitter and because of that Twitter as a whole sucks.” Point taken.
Maybe there are those that actually have friends on Twitter – how’d you swing
that? – or those that use it because they enjoy reading tweets from those on
high. Fair enough. But to the latter: that’s not what Twitter is for. They
provide RSS feeds if that’s all you want.
Too many people join Twitter because they hear that it’s awesome. This is to
you: don’t. It’s not.
❧ Is a Filesystem-Based Blog Right for You?
Chris pointed me to an insightful post by Chris Siebenmann on the
shortcomings of filesystem-based blogs. I’ll summarize his points:
Editing a post changes the modification time; this gets annoying when all you
want to do is fix a typo and not republish.
Embedding metadata is awkward.
Storing metadata in a static location defeats the entire purpose of
abandoning a database-driven engine.
The best defense I can come up with is something like this: if you’re having
these problems with your file-based blog, you probably shouldn’t be using one.
I don’t think file-based blogs are superior to anything driven by a database;
in fact they’re pretty much dumber overall. If you need the metadata that
database-driven blogs provide you’re probably better off just using one rather
than trying to turn a file-based blog into something that it’s not.
I think file-based blogs shine when your blog can be better described as a
loose coupling of essays. I make no claims that my posts are worthy of being
labeled as such, but they are infrequent and permanent. My tumblr page is
what I reserve for musings and link-posting; this site is meant for posts that
I’d like everyone to be able to see for a while.
Nevertheless, I do believe there are a few simple tricks that you can pull to
adequately address some of Siebenmann’s points.
Modification Times
The solution I have is to include the post date along with the title at the top
of my posts. The first two lines of this post are:
Is a Filesystem-Based Blog Right for You?
6/8/2008
Can interprets the first line as the title – and formats a slug
accordingly – and the second line as the publication date. The post gets
published at noon of the day given and isn’t published if the date is in the
future. The modification time of the file has nothing to do with the
publication time.
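Here is a rough sketch of what that parsing could look like. This isn’t lifted from can; the function names slugify and parse_post are made up for illustration, and I’m assuming the date is written month/day/year and that the slug is just the lowercased title with punctuation dropped and spaces turned into underscores.

```python
import re
from datetime import datetime, date, time


def slugify(title):
    """Lowercase the title, drop punctuation other than hyphens, join words with underscores."""
    cleaned = re.sub(r"[^a-z0-9\s-]", "", title.lower())
    return re.sub(r"\s+", "_", cleaned.strip())


def parse_post(text, today=None):
    """Return (slug, publish_at, should_publish) from the first two lines of a post."""
    lines = text.splitlines()
    title = lines[0].strip()
    # Assumption: dates are month/day/year, as in 6/8/2008.
    published = datetime.strptime(lines[1].strip(), "%m/%d/%Y").date()
    today = today or date.today()
    # Posts go out at noon on the given day and are held back if that day is in the future.
    publish_at = datetime.combine(published, time(12, 0))
    return slugify(title), publish_at, published <= today
```

Nothing here touches the file’s modification time, which is the whole point: fixing a typo changes the file but not the two lines that decide when the post was published.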
Reminds me of how we used to title our homework assignments in grade
school.
Metadata
Siebenmann is concerned about a lot more than publication times: metadata can
include tags, categories, revisions, modification times, and so on. I don’t
think file-based blogs are cut out to handle gobs of metadata; if you find
yourself needing that data it’s probably time to move to something like
Wordpress.
I feel that Siebenmann’s last point – that storing metadata in a static
location defeats the purpose of abandoning a database – is addressed by what I
just said above.
Bonus Problem: Post-specific Media
File-based engines have no real way to store media in connection with a post. I
considered a few options:
Just don’t. Have a /media directory and hard-code in links in posts.
Can’t name two pieces of media the same thing (i.e. no foo.jpg in two
posts).
Turn each post into a folder with the post text file and any associated
media files inside. Just one problem: it’s a real pain in the butt. Kind of
defeats the purpose of being able to just drop posts into a directory and
publish them.
Create a folder for each post in a /media directory and have can
modify links in the post to point at this new location. I’ll explain this
one below.
Embrace the possibility of a blog without any media. Would do that, but I
already have posts with screenshots and such.
My initial post on can outlined what I was planning: basically to search
and replace links in the post source. To quote myself:
But the approach I’d like is something like this:
- Publishing script creates a directory per post in a specified media base
  directory.
- Drop any media corresponding to a particular post into said directory.
- The post source uses a flag – something like class='media' – in links
  that reference local media. The publishing script looks for these in posts
  and prepends the post’s media directory to the link.
The first two are right. The third point is crap. What I want is simplicity:
can uses Markdown, and including a class for a link means writing out the
link by hand. The solution I use is to just have links that look like:
<a href='/media/is_a_filesystem-based_blog_right_for_you.html/foo.jpg'>...</a>
Simple. During publishing can goes through and replaces that link with
this:
<a href='/media/is_a_filesystem-based_blog_right_for_you.html/post_slug/foo.jpg'>...</a>
And I drop foo.jpg into that folder to complete the process. Lets me use
identical names and, more importantly, it keeps the media folder nice and
organized. Still not as easy as database-based blogs, but thankfully I’m
text-heavy and tend not to have any media.
(Disclaimer: can actually doesn’t do any of this. But it will. Soon.)
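Since none of this exists yet, here is only a sketch of how the rewrite step might go. I’m assuming posts link to media as /media/ followed by a bare filename; the function name rewrite_media_links and its parameters are hypothetical, not anything in can.

```python
import re

# Matches href='/media/<file>' or href="/media/<file>" where <file> contains no
# slash, so links that already name a post directory are left untouched.
MEDIA_LINK = re.compile(r"""(href=['"])/media/([^/'"]+)(['"])""")


def rewrite_media_links(html, post_slug, media_base="/media"):
    """Insert the post's slug between the media base and the filename."""

    def add_slug(match):
        opening, filename, closing = match.groups()
        return f"{opening}{media_base}/{post_slug}/{filename}{closing}"

    return MEDIA_LINK.sub(add_slug, html)
```

Because the pattern refuses filenames that contain a slash, running the publishing step twice over the same output shouldn’t double up the slug.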
Pick Your Poison
File-based blogs aren’t popular. For many they’re going to be a square peg in a
round hole. But if your blog isn’t complicated or updated often then they
offer a simplicity that something like Wordpress can’t.
❧ Can
Can is the blogging engine I’ve rolled for my site, with a generous tip o’ the hat to Steven Frank. It’s almost unfair to call this an “engine”: really it’s just 100 lines of Python (and not very good Python, either). Whether that’s an indication of Python’s awesomeness or just my laziness I’ll leave to you. Either way it does what I need it to do:
- Blog posts are just text files I’ve saved in a directory.
- I use Markdown as input.
- Running python can.py publish spits out my entire blog in HTML files. No server-side processing.
It’s far from where I want it to be. The templating system sucks. There’s no good way of saving media for posts. It doesn’t spit out an RSS feed. But it has one thing going for it: it’s on Launchpad.net. Branch it, add junk, and merge it back in. Or don’t.
This means that I’ve (again) broken all URLs to my site. I’ve noticed that – because I hardly write anything – I can remove (almost) all unnecessary cruft in the URLs. Before and after:
http://aneviltrend.com/blog/articles/2008/05/06/can
http://aneviltrend.com/archive/can.html
I’m still slapping myself for that /blog/articles bit of the URL. Tim Berners-Lee, the guy who coded the first web browser, penned an excellent article on the art of beautiful URLs.
Where do I see can going from this point forward? The templating system needs an overhaul. I have three template files describing a base page layout with about one line of difference between each, in flagrant violation of DRY principles. The backend for the templates is a hack as well: searching for known strings and using re.sub() to insert HTML. I feel that a cleaner approach would use Python’s built-in DOM support.
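For the DRY half of that complaint, one low-effort option is a single shared layout with placeholders. The sketch below uses the standard library’s string.Template rather than the DOM approach I mentioned, purely because it’s shorter; the layout and the name render_page are made up for illustration and aren’t what can does today.

```python
from string import Template

# One shared layout; each kind of page supplies only the parts that differ.
BASE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1>$body</body></html>"
)


def render_page(title, body_html):
    """Fill the shared layout with a page title and an already-rendered HTML body."""
    return BASE.substitute(title=title, body=body_html)
```

An index page, an archive page, and a post page would then all call render_page with different bodies instead of each carrying its own copy of the layout.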
Media support just doesn’t exist. At this point my old posts with screenshots are pulling graphics from a temporary directory I set up. But the approach I’d like is something like this:
- Publishing script creates a directory per post in a specified media base directory.
- Drop any media corresponding to a particular post into said directory.
- The post source uses a flag – something like class='media' – in links that reference local media. The publishing script looks for these in posts and prepends the post’s media directory to the link.
It seems a bit excessive – why not just hard code in the link? – but this approach seems to be the cleanest from a “writing the post” view: I don’t need to worry about what the generated slug will be (since that will likely be the name of the media directory) and it’s clean markup.
❧ Web Presence
I’ve been signing up for an alarming number of web apps lately. Nearly every
site that I visit asks me to put down my name before it’ll let me in. And,
sucker that I am, I tend to use my real name.
spreading thin
Where am I on the net?
- AIM
- This site
- Tumblr
- Twitter
- Google mail, docs, groups, etc.
- Blogspot
- Facebook
- Last.fm, Pandora
- Slashdot
- Plan 9 & gEDA mailing lists
- deviantART
- Parallax: mobile desk chair project
Quite a list. I’m making no claims here though: you may have more or less. The
point is, though, that our every move on the web is captured. If I post to the
Plan 9 mailing list, Google will list that post as my top search result.
What if that post was a nasty reply? What if it was just plain stupid? That post
is archived by hundreds of sites. It’s not getting lost.
being careful
Granted it’s easy to simply not care. So what if a Google search of my name
turns up someone who trolls forums and pesters mailing lists? The
easy answer: that it really doesn’t matter. How many people search for me
online? How many would, if they saw those posts, even know me? Will these sites
even be around ten years down the road?
And if you don’t use your real name then that answer might suffice. The
anonymity of the internet makes it easy to be multiple people. But I’d like to
focus on those that are trying to cultivate a presence. Just like the “real
world” your name on the internet carries weight. It carries your image. And with
how prominent the internet has become it’s beginning to carry a significant
amount of your identity.
I believe that we take these online personas for granted. With every
web app that gets released we have another opportunity to create yet another
identity. Unlike our analog counterpart that forgets and is forgotten, the web
identity you create is permanent. The internet, cruel mistress that she is, will
never lose that terrifically embarrassing photo. Or video. Or post.
❧ A Cursory Look at Makefiles
New to Makefiles? At best they’re confusing, and at worst completely
incomprehensible. Here’s a dissection of a simple Makefile.
structure
The basic format of a Makefile follows this:
<targets> : <dependencies>
	<commands>
Where multiple entries will make up a larger Makefile. A simple Makefile to
compile a helloworld.c program might look something like this:
helloworld.o : helloworld.c
	gcc -c -o helloworld.o helloworld.c
The target is helloworld.o, and it depends on having helloworld.c. (The -c
flag tells gcc to compile to an object file without linking.) Running
the following command:
$ make helloworld.o
will cause gcc to be run as specified in the Makefile.
practical makefiles
Developing a Makefile that might actually be used in a smallish software project
involves a bit more work. Generally speaking, the project will consist of
several – in this case – .c files, which will need to be linked in
interesting ways against each other.
The % wildcard and the automatic variables $@ and $^ do most of the work here.
Here’s how they might be used:
%.o : %.c
	gcc -c -o $@ $^
The % grabs the matching string from the target and applies it to the
dependency. If the target is foo.o, the Makefile will search for foo.c. The
next variable, $@, grabs whatever file matched the target. Likewise, $^
grabs the file(s) that matched the dependency.
Thus running
$ make helloworld.o
will, as above, run gcc -c -o helloworld.o helloworld.c. The % operator will
match “helloworld”, the $@ grabs helloworld.o, and the $^ grabs
helloworld.c.
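If the substitution still feels like magic, the matching described above can be mimicked in a few lines of Python. This is only a toy model of a single-dependency pattern rule, nothing like how make is really implemented, and the function name expand_pattern_rule is my own.

```python
def expand_pattern_rule(target, target_pattern, dep_pattern, command):
    """Return the command for `target`, or None if the pattern rule does not apply."""
    prefix, suffix = target_pattern.split("%")
    if not (target.startswith(prefix) and target.endswith(suffix)):
        return None
    if len(target) <= len(prefix) + len(suffix):
        return None  # the part matched by % must not be empty
    # The stem is whatever % matched in the target name.
    stem = target[len(prefix):len(target) - len(suffix)]
    dependency = dep_pattern.replace("%", stem)
    # $@ becomes the target, $^ becomes the dependency.
    return command.replace("$@", target).replace("$^", dependency)
```

Calling it with the target helloworld.o and the patterns %.o and %.c puts helloworld.o in place of $@ and helloworld.c in place of $^ in whatever command string you hand it.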
Most Makefiles will also define some standard targets, such as clean:
clean :
	rm -f *.o
That covers the basics. For additional resources on Makefiles check out:
❧ a variation on mips
Parallel ISA (PISA) Overview
Uses PC-indirect addressing to specify ‘registers.’ Meant to easily enable
out-of-order execution and multiple-issue logic in hardware. Otherwise exactly
like MIPS. Pronounced as “pizza.”
The modifications are to the source and destination registers. Source registers
are not addressed directly, but rather by providing the PC offset to the
instruction whose result will be used. The destination register field is
removed.
Example:
addiu $0, 5 ; x = 5
addiu $0, 4 ; y = 4
add -1, -2 ; adds x + y
Hardware can now easily determine that the first two loads can happen in
parallel – or out of order – but that the addition depends on the results of
the loads. The add cannot be issued until both loads complete. Essentially, the
ISA makes dependencies between instructions very clear.
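To make that concrete, here is a small Python sketch of how hardware (or a simulator) might read dependencies straight off the offsets. It assumes a program is a list of (text, source_offsets) pairs in which only the PC-relative operands are listed; the function names dependencies and issue_groups are hypothetical, and only backward references in straight-line code are handled.

```python
def dependencies(program):
    """Map each instruction index to the indices of the instructions it reads from."""
    deps = {}
    for pc, (_text, offsets) in enumerate(program):
        deps[pc] = [pc + off for off in offsets]
    return deps


def issue_groups(program):
    """Group instruction indices into waves whose members do not depend on each other."""
    deps = dependencies(program)
    level = {}
    for pc in range(len(program)):
        # An instruction sits one wave after the latest producer it reads from.
        level[pc] = 1 + max((level[d] for d in deps[pc]), default=-1)
    groups = {}
    for pc, lvl in level.items():
        groups.setdefault(lvl, []).append(pc)
    return [groups[lvl] for lvl in sorted(groups)]


# The three-instruction example from above.
example = [
    ("addiu $0, 5", []),
    ("addiu $0, 4", []),
    ("add -1, -2", [-1, -2]),
]
```

For the example, the two addiu instructions land in the first wave and the add in the second, which is exactly the ordering argued for above.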
Register File
Essentially a “cache” of registers. Because each entry needs to keep track of
the PC of the instruction that wrote it, the register file needs to keep a tag
record for each entry. This will incur a much higher hardware overhead in the
register file. A direct-mapped approach is used to reduce this penalty.
To support parallel operation two additional bits are needed for each entry:
- Valid bit: because the register file is now essentially a cache, a valid bit
  is needed for each entry to ensure that it is valid data.
- Pending bit: when an instruction is issued the destination register entry
  will have the pending bit set high. Any instruction down the line that
  depends on this register entry will be stalled until the pending bit is
  lowered again.
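A behavioural sketch of such a register file, in Python, might look like the following. The indexing scheme (PC modulo the number of entries) is my assumption for what “direct-mapped” means here, and the class and method names are invented for illustration.

```python
class RegisterFile:
    """Direct-mapped register file indexed by the writing instruction's PC."""

    def __init__(self, size):
        self.size = size
        self.tag = [None] * size       # PC of the instruction that owns each entry
        self.value = [0] * size
        self.valid = [False] * size
        self.pending = [False] * size

    def issue(self, pc):
        """Claim the entry for the instruction at `pc`; its result is not ready yet."""
        i = pc % self.size             # assumed direct-mapped index
        self.tag[i], self.valid[i], self.pending[i] = pc, False, True

    def write_back(self, pc, value):
        """Store a result, unless a later instruction has since claimed the entry."""
        i = pc % self.size
        if self.tag[i] == pc:
            self.value[i], self.valid[i], self.pending[i] = value, True, False

    def read(self, pc):
        """Return ("ok", value), ("stall", None) or ("miss", None)."""
        i = pc % self.size
        if self.tag[i] != pc:
            return ("miss", None)      # entry now belongs to another instruction
        if self.pending[i]:
            return ("stall", None)     # producer has issued but not finished
        return ("ok", self.value[i]) if self.valid[i] else ("miss", None)
```

With four entries, the result of the instruction at PC 0 is lost as soon as the instruction at PC 4 issues, since both map to the same slot.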
Non-sequential code
This approach works well for sequential code, but begins to break down when
loops and branches are used. Take this example:
addiu $0, 1
addiu $0, 5 ; x = 5
loop:
blez -1, done ; if( x<=0 ) goto done
subi -2, -3 ; x = x-1
j loop
done:
The blez branch, because it only checks the result of the addiu $0, 5
instruction, will never be taken. The subi instruction stores its result at a
+1 offset to the blez, which the branch does not check.
This issue gives us the following addition to the instruction set. The rd
field of the MIPS ISA, previously unused in this implementation, will now be
used to store an optional destination entry in the register file. The above
example can be rewritten as:
addiu $0, 1
addiu $0, 5, +2 ; x = 5, store into PC+2
loop:
blez +1, done ; if( x<=0 ) goto done
subi 0, -3 ; x = x-1
j loop
done:
The compiler should make the branch condition inspect the entry written by the
last instruction within the loop that assigns to the variable in question. The
last instruction to assign that variable before the branch is evaluated should
be compiled to write to that same entry.
How does this affect the parallel operation of the processor? It should have no
effect on the operation and should require minimal additional hardware.
Previously, without the destination register field, instructions were
essentially writing to offset 0. All that has changed is that instructions can
now write to other offsets. The valid and pending bits will still ensure correct
parallel operation of superscalar implementations.
This approach has the disadvantage of being taxing on compilers. Whether this
proves to be a major issue or not will need to be seen.
Disadvantages
This approach suffers from another disadvantage: the inability to reference the
result of an instruction that is at a greater offset than N, where N is the
register file size. Because the register file uses the PC of an instruction to
store entries, an instruction more than an offset of N away from another
cannot use the result of the latter. This is an inherent limitation of the
instruction set.
A workaround is to write a value that must be accessed later to memory. This
might lead to an increased number of memory accesses throughout the program,
which will lead to decreased performance.
Another simple solution is to simply increase the size of the register file.
Though this would lead to increased hardware costs and might lead to longer
delays in register file reads, this would solve the problem somewhat.
Interestingly, the problem could also be alleviated by adding a second layer
register file cache, much like adding a second layer data cache. This would
offer the benefits of making register entries available for longer periods, but
has the disadvantage of making program flow harder to predict. A compiler would
need to keep track of the simulated cache state to determine if a register entry
will still be available at a later point in the program.
❧ The Assembler
My friend Chris and I recently finished our USB project. I was trying
to think of a good way to present this in a post, and decided to highlight one
small part of the project: the assembly that drives the lowest layers of the
software stack.
Old School
Last semester I had hacked up the assembly for our project without much
foresight or care for elegance. We were more preoccupied with trying to grasp
the 650-page spec that is USB, and beautiful code was the least of our
worries.
Though the code didn’t look good, it still had to be good, and here’s what
it had to accomplish:
- Output one bit from a buffer in memory every 10 cycles (no more, no less).
- Keep track of the number of bits sent and stop when that is equal to a given
  bit count.
- Every eight bits sent load another byte from memory.
- Raise and lower the enable line so that the Mega32 can drive the bus when
  transmitting and read the bus when receiving.
This becomes a tall task when you only have 9 cycles to work with (one cycle is
needed to output on the I/O pins). Let’s look at what the assembly I wrote last
semester for transmitting a packet looks like:
#define SIE_TOKEN_BIT
mov r20, r3
andi r20,0x01
add r20, r10
out %1, r20
lsr r3
subi r16, 1
brne .+4
jmp .sie_send_token_eop
/* Token: Send bit, and nop until next one */
#define SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT
NOP2
/* Token: Send bit, and buffer another byte */
#define SIE_TOKEN_BIT_BUFFER
SIE_TOKEN_BIT
ld r3, X+
The first thing to note is that all loops were unrolled in this assembly. To
send a byte, we had this macro:
#define SIE_TOKEN_BYTE
SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT_NOP
SIE_TOKEN_BIT_BUFFER
This would literally copy and paste in the assembly above about eight times. And
that wasn’t even the worst of it: because the loops were unrolled, we had to
copy in enough loop iterations for the worst case scenario. For the data packet
transmit code – which at most could send 103 bits – we had the byte macro
copied in 13 times. This equated to about 936 lines of code – for just one
small part of the SIE code. The sheer size of this hurt us; our code weighed in
at about 20 KB when compiled. On a device with only 32 KB of program flash
memory this becomes a bit of a problem (considering that our code was intended
as a companion library to an existing user program).
The assembly above shouldn’t be too confusing, but a few notes are in order.
- r3 is used to store the byte buffered from memory.
- r20 is used as a temporary register.
- r10 holds the value 0x05.
- %1 is a compiler directive for PORTA.
- r16 holds the bit count for the given packet.
Each line is explained in order:
mov r20, r3
Copies the buffer register into the temporary register.
andi r20,0x01
Performs an AND operation on the temporary register with a bit mask to
extract the lowest bit. This is the bit that will be sent on the bus.
add r20, r10
Adds the bit to be sent with 5: what this is essentially doing is
differentially encoding the signal and setting the enable pin high at the
same time. The pin assignments were: enable on pin 2, D+ on pin 1, and D- on
pin 0. Thus if the bit in the temporary register is 1, meaning that we
should be sending a differential 1, adding 5 will yield 0b00000110. The
enable line is set high, as is D+. D- is low.
out %1, r20
Outputs the value of the temporary register on PORTA.
lsr r3
Shifts the buffer register down one, getting it ready for when the next bit
will be sent.
subi r16, 1
Decrements the bit count by one.
brne .+4
When the bit count hits zero, this branch will not be taken.
jmp .sie_send_token_eop
If the above branch is not taken – meaning that all bits have been sent
– then the code will jump to the end-of-packet handler.
Repeat this seven times, and add a load instruction on the eighth, and you have
the complete workings of last semester’s assembly. It worked, true, but it was
gross; and with an entire semester to rework stuff I decided to sit down and
hammer out some nice code.
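For anyone who would rather read Python than AVR assembly, here is a rough model of the values that old code puts on PORTA. It is not cycle-accurate and ignores the end-of-packet handler; I’m assuming the first byte is already sitting in r3 before the first bit goes out, and the function name old_tx_port_values is made up.

```python
def old_tx_port_values(data, bit_count, r10=0x05):
    """Model the PORTA values written by the unrolled per-bit macros."""
    values = []
    byte_index = 0
    r3 = data[byte_index]                 # assumed: first byte already buffered
    for sent in range(1, bit_count + 1):
        r20 = (r3 & 0x01) + r10           # mov, andi, add: low bit plus 5
        values.append(r20)                # out %1, r20
        r3 >>= 1                          # lsr r3
        if sent % 8 == 0 and sent < bit_count:
            byte_index += 1
            r3 = data[byte_index]         # ld r3, X+ after every eighth bit
    return values
```

A 1 bit comes out as 6 (enable and D+ high) and a 0 bit as 5 (enable and D- high), matching the pin assignments described above.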
New School
We came into this semester knowing that USB with a Mega32 is indeed possible.
We also knew what USB was. We figured that a full code overhaul would be in
order, and there’s no better place to start than at the bottom.
The assembly, from above, was completely tossed. Little by little we came up
with our new assembly – replete with rolled-up loops and clever hacks. These
changes required some small modifications to the hardware, but nothing major.
Here’s the code that made it into our final revision:
#define TX_PACKET(label,mem_pointer,bit_count_reg)
mov r5, __zero_reg__
ldi r20, 0x01
/* buffer */
.sie_#label_tx_buffer:
ld r10, #mem_pointer+
/* bit tx */
.sie_#label_tx_bit:
lsl r10
rol r20
out %4, r20
/* completion checks */
dec #bit_count_reg
breq .sie_#label_tx_done
add r5, r3
brcs .sie_#label_tx_buffer
ldi r20, 0x01
rjmp .sie_#label_tx_bit
/* done */
.sie_#label_tx_done:
I could try to explain all the assembly here, but I’d be repeating an entire
chapter of our documentation on the project. Chapter 4 of the documentation
covers each line of the assembly and how it works. Check it out, or try and
figure out what the assembly is doing on your own. Should you choose to do that,
know that:
-
The pin assignments are: enable on pins 1 & 2 (these get ORed together into
one enable line) and transmit on pin 0 (this gets differentially encoded by
the hardware).
-
%4 is a compiler directive for the USB port.
-
#label, #mem_pointer, and #label are parameters passed to the macro
and are pasted into the macro where needed before the code is compiled.
-
r3 holds the value 0x20.
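For anyone taking the figure-it-out-yourself route, here is a rough simulation of the macro in plain shell arithmetic. It is one reading of the listing above, not code from the project: the function name tx_packet and the idea of printing each value written to the port are invented for illustration, and it assumes 8-bit registers with the usual AVR carry behavior (lsl pushes bit 7 into carry, rol pulls carry into bit 0, add sets carry on overflow).

```sh
# Simulate the TX_PACKET loop and print the value written to the port per bit.
# Usage: tx_packet BIT_COUNT BYTE...  (bytes as numbers, e.g. 0xA5)
# The caller must supply enough bytes to cover BIT_COUNT bits.
tx_packet() (
    bit_count=$1; shift
    r3=$((0x20))                                # per the notes above
    r5=0                                        # mov r5, __zero_reg__
    r20=1                                       # ldi r20, 0x01
    load=1
    while :; do
        if [ "$load" -eq 1 ]; then
            r10=$(( $1 & 0xFF )); shift         # ld r10, pointer+
            load=0
        fi
        carry=$(( (r10 >> 7) & 1 ))             # lsl r10: bit 7 goes to carry
        r10=$(( (r10 << 1) & 0xFF ))
        r20=$(( ((r20 << 1) | carry) & 0xFF ))  # rol r20: carry lands in bit 0
        echo "$r20"                             # out %4, r20
        bit_count=$(( bit_count - 1 ))          # dec bit_count_reg
        [ "$bit_count" -eq 0 ] && break         # breq ..._tx_done
        r5=$(( r5 + r3 ))                       # add r5, r3
        if [ "$r5" -gt 255 ]; then              # brcs ..._tx_buffer (every 8th bit)
            r5=$(( r5 & 0xFF ))
            load=1                              # note: r20 is NOT reset on this path
        else
            r20=1                               # ldi r20, 0x01
        fi
    done
)
```

Running tx_packet 9 0xA5 0x80 prints 3 2 3 2 2 3 2 3 7, one value per line. The first eight values are 2 or 3: bit 1 is set (enable) and bit 0 carries the data. The ninth is 7 because the byte reload skips the ldi, so the old bits slide up one position. That looks like the reason enable is wired to both pins 1 and 2: whichever of those bits is set, the ORed enable line stays high.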
Explanation aside, the punchline is that nearly 1000 lines of assembly in our
previous project got replaced with just 12 lines of cleverness.
❧ Everyday Linux: rsync
I have two computers, both of which I like to listen to music on: my laptop, for
when I’m in class (ahem, studying), and my desktop when I’m in my room. I
use iTunes on my laptop and Amarok on my desktop. You can see that I might have a few
issues when syncing my music between the two computers. How I do it is today’s
Everyday Linux.
rsync
This is my typical scenario: I’m on campus, and I’ve just got some music from a
site like soul sides or Amazon’s sweet new music store. I import my
music and listen to it on iTunes. Later I get back to my room, and would like to
listen to my new music with the better speakers hooked up to my desktop. How do
I sync up my music?
I’m typically nowhere near my desktop, and can’t get at it, when I pick up
new music. So I need a way to copy whatever music I have
over to my desktop. But why not just copy over the music manually? Well, I
could, but that’s not the cool way to do it.
The more practical reason is that iTunes on my laptop organizes my music as it
sees fit, and I’d rather not have to traverse arcane directory names in order to
get to the folder that I want to copy over. Amarok, fortunately, is much more
forgiving with its organization, and will put up with the structure that iTunes
uses. I also prune my music collection from time to time, and would rather not
have to track what changes I make to do the same on my desktop. In short: I need
rsync so I have a drop-dead simple syncing system for my music.
rsync is a utility that synchronizes files and directories. Source and
destination can both be local, or one can be local and the other remote (rsync
won’t copy directly between two remote machines). All it takes is one terminal
command (granted, you’ll probably spend some time perfecting this command). The
other caveat is that, if you’re using rsync with remote computers, you’ll need
rsync installed on each of them, plus either SSH access or a running rsync
daemon.
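To get a feel for it before involving a second machine, the simplest case is two local directories. The sketch below is only an illustration: sync_dirs is a made-up helper name, and the options are the plain mirroring subset of the ones used later in this post.

```sh
# sync_dirs SRC DEST: make DEST mirror the contents of SRC (both local here).
sync_dirs() {
    # A trailing slash on the source means "the contents of SRC",
    # not the SRC directory itself.
    # --recursive descends into folders, --times keeps modification times,
    # --delete removes files from DEST that no longer exist in SRC.
    rsync --recursive --times --delete "${1%/}/" "${2%/}/"
}
```

Calling sync_dirs ~/Music /mnt/backup/music (hypothetical paths) would leave the second directory as a copy of the first. Swapping either argument for the host:path form is what turns this into a remote sync.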
Implementation
The first step was to get my desktop set up for rsync. I created a music folder
in my home directory to begin, then set up the rsyncd.conf file in my /etc
directory.
To set up rsync on my Ubuntu system I followed along at the Ubuntu guide
entry and at another excellent page. The more exciting parts of my rsyncd.conf
file look like this:
[musicbackup]
path = <home>/music
comment = the music backup location
secrets file = /etc/rsyncd.secrets
Not bad at all. The rsyncd.secrets file is next:
<user>:<password>
For me it’s just one line: my name and password.
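One detail worth flagging, going by the rsyncd.conf man page rather than by anything shown above: the secrets file is only consulted when the module also lists its allowed users with auth users, and modules are read-only unless told otherwise. A fuller module might therefore look like the sketch below. The two extra settings are assumptions about what a working setup needs, and the placeholders are kept as they are.

```
[musicbackup]
path = <home>/music
comment = the music backup location
# assumption: not shown above; without it the secrets file is ignored
auth users = <user>
secrets file = /etc/rsyncd.secrets
# assumption: needed for uploads, since modules default to read-only
read only = false
```

A module like this is addressed with a double colon, as in <user>@<desktop ip>::musicbackup. The single-colon host:path form goes through the remote shell instead and never touches the daemon.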
Now the fun part is on my laptop. This is the rsync command that I use to do a
one-way sync from my laptop to my desktop.
rsync --verbose --progress --stats --compress --rsh=/usr/bin/ssh \
--recursive --times --delete \
--exclude "Apple" \
--exclude "Movies" \
...
--exclude "Video" \
<home>/Music/iTunes/iTunes\ Music/* \
devrin@<desktop ip>:music
The options I specify include:
-
Keep me updated with
--verbose and --progress.
-
Minimize the bandwidth with
--compress.
-
Prune out removed directories with
--delete.
-
Traverse into each folder with
--recursive.
-
Don’t copy a few specific directories (in my case because they have video
files) by using
--exclude.
It would be a real pain to have to
type out this entire command each time I wanted to sync, so I copied it into its
own bash script that I called music_backup.sh. Now each time I want to back up
I just invoke my little script:
$ ./music_backup.sh
And my music gets synced up. Not bad! You’ll want to read up on the
documentation I linked to above to get a better feel for how to use rsync to
accomplish your goals. There are a few steps there that I didn’t cover but that
should be fairly trivial to do.
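A wrapper script along the lines of music_backup.sh might look like the sketch below. It is not the actual script: the function name, the PRINT_ONLY variable, and the way exclusions are passed as arguments are all made up for illustration, and the paths stay as placeholders.

```sh
#!/bin/sh
# Sketch of a music_backup.sh-style wrapper (hypothetical, not the original).
# Usage: music_backup SRC DEST [EXCLUDE...]
# Set PRINT_ONLY=1 (an invented switch) to print the command without running it.
music_backup() (
    src=$1
    dest=$2
    shift 2

    # Turn each remaining argument into an "--exclude NAME" pair.
    count=$#
    while [ "$count" -gt 0 ]; do
        set -- "$@" --exclude "$1"
        shift
        count=$((count - 1))
    done

    # Same options as the command in this post; the trailing slash sends
    # the contents of SRC rather than the directory itself.
    set -- --verbose --progress --stats --compress --rsh=/usr/bin/ssh \
        --recursive --times --delete "$@" "${src%/}/" "$dest"

    if [ -n "${PRINT_ONLY:-}" ]; then
        echo "rsync $*"
    else
        rsync "$@"
    fi
)
```

It would be called as music_backup "<home>/Music/iTunes/iTunes Music" "<user>@<desktop ip>:music" Apple Movies Video. One deliberate difference from the command shown earlier: the source is passed with a trailing slash instead of /*. According to the rsync man page, --delete only prunes inside directories that rsync itself was asked to transfer, so with a shell wildcard a top-level folder removed on the laptop would be left behind on the desktop.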
All in all, rsync is a great system. It works perfectly for what I need it to
do, and I’m sure that a lot of people have some sort of syncing problem that
could be solved elegantly with rsync.