http://webster.cs.ucr.edu/AsmTools/RollYourOwn/
From here you can download a set of PDFs that together make up a text on compiler theory. Read it! It's a bit long, but it's what I've been looking for. The author's page is:
http://www.scifac.ru.ac.za/compilers/
Yes, the source must first be broken down into tokens. These tokens must then be parsed into syntactic structures, usually a syntax tree. The syntax tree is then checked by a semantic checker, which mostly means type checking. From the checked tree, intermediate code is generated (optional for small compilers, but very useful), optimized, and finally converted to code for the target machine.
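To make those stages concrete, here is a rough sketch of the whole pipeline for a toy expression language. It is only an illustration and is not taken from the linked text. Everything in it is an assumption of mine: the grammar (integers, true/false, +, *, parentheses), the tuple-based tree, the PUSH/ADD/MUL intermediate code, and the made-up stack machine used as the "target".

```python
import re

def tokenize(src):
    """Stage 1: break the source text into (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"[0-9]+|[a-z]+|\S", src):
        if text.isascii() and text.isdigit():
            tokens.append(("NUM", int(text)))
        elif text in ("true", "false"):
            tokens.append(("BOOL", text == "true"))
        elif text in ("+", "*", "(", ")"):
            tokens.append((text, None))
        else:
            raise SyntaxError(f"unexpected token {text!r}")
    return tokens

class Parser:
    """Stage 2: recursive descent parser that builds a syntax tree of tuples."""
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expr(self):                      # expr := term ('+' term)*
        node = self.term()
        while self.peek() == "+":
            self.advance()
            node = ("+", node, self.term())
        return node

    def term(self):                      # term := factor ('*' factor)*
        node = self.factor()
        while self.peek() == "*":
            self.advance()
            node = ("*", node, self.factor())
        return node

    def factor(self):                    # factor := NUM | BOOL | '(' expr ')'
        kind = self.peek()
        if kind in ("NUM", "BOOL"):
            return (kind.lower(), self.advance()[1])
        if kind == "(":
            self.advance()
            node = self.expr()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.advance()
            return node
        raise SyntaxError(f"unexpected {kind}")

def parse(tokens):
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.peek() != "EOF":
        raise SyntaxError("trailing input")
    return tree

def check(node):
    """Stage 3: semantic check; here that is just type checking."""
    if node[0] == "num":
        return "int"
    if node[0] == "bool":
        return "bool"
    if check(node[1]) != "int" or check(node[2]) != "int":
        raise TypeError(f"operands of {node[0]!r} must be int")
    return "int"

def gen(node, code):
    """Stage 4: walk the tree and emit intermediate stack code."""
    if node[0] in ("num", "bool"):
        code.append(("PUSH", node[1]))
    else:
        gen(node[1], code)
        gen(node[2], code)
        code.append(("ADD",) if node[0] == "+" else ("MUL",))
    return code

def optimize(code):
    """Stage 5: constant folding on the intermediate code."""
    out = []
    for ins in code:
        foldable = ins[0] in ("ADD", "MUL") and len(out) >= 2
        if foldable and out[-1][0] == "PUSH" and out[-2][0] == "PUSH":
            b, a = out.pop()[1], out.pop()[1]
            out.append(("PUSH", a + b if ins[0] == "ADD" else a * b))
        else:
            out.append(ins)
    return out

def emit(code):
    """Stage 6: convert intermediate code to text for the made-up target."""
    return [f"push {int(i[1])}" if i[0] == "PUSH" else i[0].lower() for i in code]

def compile_expr(src, fold=True):
    tree = parse(tokenize(src))
    check(tree)
    code = gen(tree, [])
    return emit(optimize(code) if fold else code)
```

Calling compile_expr("2 * (3 + 4)", fold=False) shows the unoptimized instruction list, while the default call folds the constants first. An input such as "1 + true" gets through the tokenizer and parser but is rejected by the type checker, which is the point of having a separate semantic pass.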
I'm learning a lot!