I gave a short presentation on the build process and C program startup for embedded projects. Below are the slides from the presentation, with some additional information added for this post.
In general, I wanted to point out that most C compilers work in the same fashion. The examples below are based on GCC for Cortex-M. Other compilers offer equivalent functionality that differs only in syntax or naming, and you can achieve the same things with them.
The name "Compiler" for a toolset is a bit misleading. A toolset is a set of tools used for building a project, and compilation is only one of the build steps, done by the compiler proper. The image shows the most vital parts of the toolset (light green), which are, again, misleadingly labeled "Compiler" in the lower left corner.
The complete development environment is shown with emphasis on the fact that the Editor, Compiler and Debugger are separate entities. They are usually sold together as one IDE, but you can take each from a different vendor if you wish: write code in Visual Studio, compile with make and GCC, and debug in Eclipse using GDB over OpenOCD. The executable file contains debug info (unless it was intentionally stripped during the build), so the debugger can locate the source files referenced in it. Such setups are complex to configure, so it is usually best to stick to a simple all-in-one solution.
The compiler marks the places where the address of an external symbol is needed. The linker fills in those addresses after all the pieces are put together and absolute locations can be calculated.
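As a rough illustration of what an external symbol looks like from the C side, here is a minimal sketch. The names shared_counter and bump_counter are made up for this post, and both parts are shown in one listing only to keep it self-contained. In a real project the definition would sit in a different source file, so the compiler would see only the declaration and would have to leave the address for the linker to fill in.

```c
/* Declaration only: the compiler learns the type, not the address.
   Normally this line would come from a header file. */
extern int shared_counter;

/* The address of shared_counter is needed here. When the definition
   lives in another source file, the object file contains a
   placeholder that the linker patches later. */
int bump_counter(void)
{
    shared_counter += 1;
    return shared_counter;
}

/* Definition: in a real project this would live in a different
   source file (hypothetical example value). */
int shared_counter = 41;
```

Calling bump_counter() once returns 42 with the initial value used above.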
The build process in the example consists of two calls to the compiler, which convert the source files to object files, and one call to the linker, which glues them together. The well-known 'make' does the same thing. It just uses a more complex and flexible syntax to describe which utilities are called, with what parameters and in what order. IDEs also do this internally, but there the parameters are set through clickable options in dialogs.
A few points about GCC, as a reference for the sample:
A short comment on the -ffunction-sections and --gc-sections option pair: size optimization by deleting unused code is not straightforward. Source files are not thrown at the compiler executable all at once and processed together. They are processed one by one. A function can't simply be removed because it is not used inside its own source file, since it may be called from some other source file. So the compiler can't do it. The linker, on the other hand, knows nothing about functions; it only sees chunks of code and data. The trick is to instruct the compiler (-ffunction-sections) to tag each function and emit it as a separate chunk of linker input. The linker (--gc-sections) can then figure out which chunks are never referenced and omit them from the final image.
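A minimal sketch of the idea, with function names invented for this post. When compiled with -ffunction-sections, GCC places each function in its own input section named after the function (.text.used_function and so on), instead of lumping everything into .text.

```c
/* With -ffunction-sections this goes into section .text.used_function */
int used_function(int x)
{
    return x * 2;
}

/* With -ffunction-sections this goes into section .text.unused_function.
   Nothing references it, so a linker run with --gc-sections is free
   to drop the whole section from the final image. */
int unused_function(int x)
{
    return x + 100;
}

/* Stand-in for the program entry point; it only reaches used_function. */
int entry(void)
{
    return used_function(21);
}
```

Without the option pair, both functions would end up in the image simply because they share one .text chunk from the same object file.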
The MEMORY layout tells the (GCC) linker where the memories are on the target platform. This differs from device to device.
The SECTIONS entries instruct the linker where to put the chunks of "program" generated by the compiler. By default the (GCC) compiler tags code (functions) with ".text", constants with ".rodata", initialized global variables with ".data" and uninitialized global variables with ".bss". You can also use your own tags, as shown. The tagged chunks are called 'compiler output sections' or 'linker input sections' and are not to be confused with 'linker output sections'.
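On the C side, custom tags are applied in GCC with the section attribute. The sketch below uses made-up section names (.my_config, .my_fast_code) and made-up symbols. The linker file would still need matching SECTIONS entries to place them where you want. The macro is an assumption of this example, added so the code still compiles on toolchains without this attribute.

```c
/* GCC-style section attribute; expands to nothing elsewhere
   (the IN_SECTION macro is a helper invented for this example). */
#if defined(__GNUC__) && defined(__ELF__)
#define IN_SECTION(name) __attribute__((section(name)))
#else
#define IN_SECTION(name)
#endif

/* A constant tagged ".my_config" instead of the default ".rodata". */
IN_SECTION(".my_config") const unsigned int config_word = 0xCAFEu;

/* A function tagged ".my_fast_code" instead of the default ".text",
   e.g. so the linker file can place it in a faster memory. */
IN_SECTION(".my_fast_code") int fast_add(int a, int b)
{
    return a + b;
}
```

The code behaves exactly as it would without the tags. Only the placement in the final image changes.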
There is a problem with non-loadable programs. Loadable programs, like those on a PC where the OS loads the executable into memory and runs it, get their global variables initialized by the loading itself: the initial values are written to the variable locations as part of loading, and the program is built that way. But nothing loads embedded programs. They are usually stored in nonvolatile (read only) memory and start executing at power on. So what about the initial values of global variables? They can't just appear in RAM out of thin air. RAM contents are undefined at power on, but RAM is where the variables are! The trick is to store the initial values somewhere in nonvolatile memory and copy them to the locations of the global variables immediately after power on. GCC enables this with the 'AT>' keyword in the linker file. Global variables are linked to addresses in RAM, but their initial values are stored where 'AT>' points. Other compilers do this too. You just don't notice that they add this so-called 'startup code' to your project and insert a call to it soon after reset.
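The copy step itself is small. Below is a minimal sketch of such startup code, written so that it can be tried on a PC: two arrays stand in for the nonvolatile memory and the RAM, and all names are invented for this example. On a real target the three pointers would instead come from symbols defined in the linker file (their names vary between projects), and the code would run right after reset, before main.

```c
#include <stdint.h>

/* Copy initial values from where they are stored (nonvolatile memory)
   to where the variables live (RAM), one word at a time. */
void copy_initial_values(uint32_t *ram, const uint32_t *ram_end,
                         const uint32_t *stored)
{
    while (ram < ram_end) {
        *ram++ = *stored++;
    }
}

/* Variables without an initial value (.bss) are simply set to zero. */
void clear_uninitialized(uint32_t *ram, const uint32_t *ram_end)
{
    while (ram < ram_end) {
        *ram++ = 0u;
    }
}

/* PC-side stand-ins: "flash" holds three initial values, "RAM" starts
   out with junk, as it would after power on. */
const uint32_t fake_flash[3] = { 10u, 20u, 30u };
uint32_t fake_ram[5] = { 0xDEADBEEFu, 0xDEADBEEFu, 0xDEADBEEFu,
                         0xDEADBEEFu, 0xDEADBEEFu };

/* Runs both steps; returns 1 if the RAM holds the expected values. */
int startup_demo(void)
{
    copy_initial_values(&fake_ram[0], &fake_ram[3], fake_flash); /* .data */
    clear_uninitialized(&fake_ram[3], &fake_ram[5]);             /* .bss  */
    return fake_ram[0] == 10u && fake_ram[1] == 20u &&
           fake_ram[2] == 30u && fake_ram[3] == 0u && fake_ram[4] == 0u;
}
```

After startup_demo() the first three words of the fake RAM hold the values from the fake flash and the last two are zero, which is the state a C program expects to find its global variables in when main starts.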
This is the most minimal C project. It would be a single-source-file project if Some.h, SSomeStruct and SomeCall were not used.
The build process usually generates a summary of what was built. Its appearance differs from compiler to compiler, but in general it lists what is present in the final image, where it is located and how large it is. It is commonly referred to as a 'map' file.