Git has a steep learning curve, and it helps tremendously if you understand what it is doing under the covers. I think it took me a few months of trying it out here and there until I finally had a revelation and everything started to make sense. This site did the trick for me…
That being said, what you want as a Git workflow is to present a few (let's say 1 to 5) commits as a Pull Request to the upstream repository. Those commits should ideally be based on top of the commit that master points to in that upstream repository. So how do you get that?
Day 1
I fork the upstream repo on the GitHub website, then clone it locally to my disk. At this point, there are 3 repositories in existence: the original “UPSTREAM” on GitHub, the forked “ORIGIN” on GitHub and the local “LOCAL” repository on my disk. All 3 repos are identical at this point, but they are 3 different instances, connected via various magical strings (UPSTREAM <-> ORIGIN is maintained by GitHub magic, ORIGIN <-> LOCAL is maintained by a remote on my LOCAL repository). Also note that when I cloned to LOCAL, Git went ahead and checked out the “master” branch.
MentalPause: A branch in Git is a pointer to a commit. A branch just points to a commit. In C: typedef struct Commit_t *Branch_t; Branch_t b = &myCommit; (make sure this sinks in, read the site mentioned in the first paragraph in order to understand this). /MentalPause
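To see the "a branch is a pointer" idea on disk, here is a minimal sketch. It assumes a throwaway repository created with mktemp (not part of the story above) and a demo identity so that commits work on any machine. It only shows that a branch name and the commit it points to resolve to the same hash.

```shell
#!/bin/sh
# Sketch: a branch is just a name that resolves to one commit hash.
set -eu
# Demo identity (an assumption, so the commit works without any Git config).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                          # hypothetical scratch repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # call the first branch "master" regardless of local defaults
echo hello > file.txt
git add file.txt
git commit -q -m "A"

commit_a=$(git rev-parse HEAD)             # hash of commit A
master_target=$(git rev-parse master)      # the commit that "master" points to
git branch feature_parachute               # a new branch is only a second pointer to A
feature_target=$(git rev-parse feature_parachute)

echo "A:                 $commit_a"
echo "master:            $master_target"
echo "feature_parachute: $feature_target"
```

All three lines should print the same hash: creating a branch copies no files, it just adds another pointer to the commit.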
So there’s a branch in the “upstream” repository called “master”, which points to a commit A. We shall refer to this as upstream/master. There’s another branch in my origin repository that is also called “master” and that one points to the same commit A; let's call this one “origin/master”. And there’s one more branch, also called “master”, which resides on my disk and… that’s just plain old master. My local repository knows about the latter two right now (git branch -va). In order to “get” to the upstream one, I need to define a local remote that points to the upstream repository:
git remote add upstream git@github.com:diydrones/apm_planner.git
git fetch upstream
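The three-repository setup from Day 1 can be rehearsed offline. This is a rough sketch under stated assumptions: local bare repositories (upstream.git and origin.git, hypothetical paths in a temp directory) stand in for the two GitHub repositories, and a plain clone stands in for the GitHub fork.

```shell
#!/bin/sh
# Sketch: UPSTREAM, ORIGIN and LOCAL as three separate repositories on one disk.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Seed repository that provides commit A.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m "A"
cd "$work"

git clone -q --bare seed upstream.git          # stand-in for UPSTREAM on GitHub
git clone -q --bare upstream.git origin.git    # stand-in for the fork (ORIGIN)
git clone -q origin.git local                  # LOCAL; the remote "origin" is set up by clone

cd local
git remote add upstream "$work/upstream.git"   # same idea as the git remote add command above
git fetch -q upstream

local_master=$(git rev-parse master)
origin_master=$(git rev-parse origin/master)
upstream_master=$(git rev-parse upstream/master)
git branch -va                                 # lists master, origin/master and upstream/master
```

After the fetch, all three names resolve to commit A, which matches the state described for the end of Day 1.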
Day 2
I’ve decided to enhance the code base and implement feature “parachute”. I enter my local repository and start a new branch:
git checkout -b feature_parachute
MentalPause: The new branch starts out its life pointing at the same commit that was checked out before I typed my command. Since I haven’t done anything since yesterday, this happens to be commit A, which my master points to. I know that’s also where origin/master points, since I’m the only guy working on my repository on GitHub, but I actually don’t know if upstream/master also still points to commit A. Maybe someone else worked on upstream and added a commit or two, and then upstream/master is already ahead of me. Oh well, no worries, I’ll fix that later. /MentalPause
I start modifying files to implement the feature. Turns out there are two sets of changes required to make this work. I decide to split these changes into two commits as well.
git add parachute.c parachute.h
git commit -m "P1: parachute implementation"
git add mavlink.c
git commit -m "P2: mavlink parachute command"
Where are our branches pointing now?
upstream/master -> ??
origin/master -> A
master -> A
feature_parachute -> P2
The local log looks like this: A -> P1 -> P2.
If we were to send this upstream (as a pull request), it wouldn’t necessarily work. Maybe there are changes up there that conflict with ours. So before sending, we need to make sure that my commits “apply”/“merge” properly. We do this by “rebasing” our changes onto the current upstream/master. First off, let's find out the current state of upstream:
git fetch upstream
Oh my, people have been busy. upstream/master points to commit D (the log shows A->B->C->D). Three commits have been added since yesterday. And one of them (commit C) modifies the file mavlink.c. This is bound to create conflicts. I better fix them now before submitting the pull request, which will only cause grief in its current form.
Tip: Git always works on the branch that is currently checked out.
So I have feature_parachute checked out. I now run this:
git rebase upstream/master
What this command does is pretty amazing. First off, we’re right now on commit P2, and upstream/master points to D. Git walks the history backwards and finds the commit that P2 and D have in common (that’s A). It then sets my commits P1 and P2 aside (they are not lost, just no longer checked out) and checks out D, which already contains B, C and D on top of A. Now the working directory looks like D.

And here comes the amazing part: Git figures out the lines that changed in every file to transform A to P1. Mathematically speaking, that’s (P1 - A). It then adds that difference to the current folder (D). This will create P1’ (which is D + (P1 - A)). So the new commit P1’ is almost identical to the other P1, except it has a different parent. The parent of P1 is A, the parent of P1’ is D. So, strictly speaking, P1’ is not the same commit as P1. Even though the commit message is the same, the author is the same, the line changes in the files are the same … the parent is different. So this is a brand-new commit. I will even give it a new letter: E. It looks like P1, but it isn’t. Make sure you understand this!

Git then continues and takes the difference between P1 and P2 (P2 - P1) and applies that to E in order to create … F. Again, superficially, F looks deceptively similar to P2. But it is a new, different, independent commit. Finally, Git takes my pointer (the branch called “feature_parachute”), which was pointing to P2, and makes it point to F. Phew.
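The claim that rebased commits are brand-new commits can be checked with a small experiment. This is only a sketch under assumptions: it uses a scratch repository, a local branch named upstream_master as a stand-in for the real upstream/master, and files that do not conflict, so the rebase runs straight through.

```shell
#!/bin/sh
# Sketch: after a rebase, P1 and P2 are replaced by new commits E and F on top of D.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "A"

# My feature branch: A -> P1 -> P2
git checkout -q -b feature_parachute
echo p1 > parachute.c
git add parachute.c
git commit -q -m "P1: parachute implementation"
echo p2 > mavlink.c
git add mavlink.c
git commit -q -m "P2: mavlink parachute command"
old_p1=$(git rev-parse HEAD~1)
old_p2=$(git rev-parse HEAD)

# Stand-in for upstream/master (hypothetical local branch): A -> B -> C -> D
git checkout -q -b upstream_master master
for c in B C D; do
    echo "$c" > "$c.txt"
    git add "$c.txt"
    git commit -q -m "$c"
done
commit_d=$(git rev-parse HEAD)

# Rebase the feature branch onto D.
git checkout -q feature_parachute
git rebase -q upstream_master

new_e=$(git rev-parse HEAD~1)                  # the re-created P1
new_f=$(git rev-parse HEAD)                    # the re-created P2
parent_of_e=$(git rev-parse HEAD~2)            # should be D
subject_e=$(git log -1 --format=%s "$new_e")   # commit message survives the rebase
upstream_after=$(git rev-parse upstream_master)

echo "P1 $old_p1 became E $new_e"
echo "P2 $old_p2 became F $new_f"
```

The hashes of E and F differ from those of P1 and P2 even though the messages are unchanged, the parent of E is D, and the stand-in upstream branch itself has not moved.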
TLDR: sorry, no shortcuts - understanding how Git works is the key to using it. /TLDR
So git rebase re-creates all of “our” commits on top of the indicated branch and moves our current branch to the last of these brand-new commits. It doesn’t modify upstream/master itself. That’s just the destination of our commits.
Of course, the story above didn’t finish with a happy ending… there was a hiccup. Just as Git was about to apply the difference (P2 - P1), it realized that it conflicted with some changes introduced with commit C. So it stopped, spat out an error message, and now I had to go in and fix it. This typically involves editing the file, removing the merge conflict markers and fixing the code. Once this was done, I staged the fixed file and continued the rebase operation with
git add mavlink.c
git rebase --continue
Problem? If things get too hairy and the rebase just looks too disastrous, don’t worry. There’s an emergency exit. While the conflicts pile up and there’s no way to fix them right now, I can type
git rebase --abort
and everything is back to where I started. I’m on branch feature_parachute, and it points to P2. The real P2. The grandson of A. /Problem?
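The emergency exit can be rehearsed safely in a scratch repository. The sketch below makes some assumptions: it forces a conflict by having both sides rewrite the same line of mavlink.c, and it again uses a hypothetical local branch upstream_master instead of a real remote.

```shell
#!/bin/sh
# Sketch: a rebase that stops on a conflict, followed by git rebase --abort.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "original" > mavlink.c
git add mavlink.c
git commit -q -m "A"

# My change to mavlink.c on the feature branch.
git checkout -q -b feature_parachute
echo "parachute command" > mavlink.c
git commit -q -am "P2: mavlink parachute command"
p2=$(git rev-parse HEAD)

# A competing change to the same line on the upstream stand-in.
git checkout -q -b upstream_master master
echo "upstream change" > mavlink.c
git commit -q -am "C"

git checkout -q feature_parachute
conflicted=""
if git rebase upstream_master >/dev/null 2>&1; then
    echo "unexpected: the rebase went through without a conflict"
else
    # List the files Git could not merge, then bail out of the rebase.
    conflicted=$(git diff --name-only --diff-filter=U)
    echo "rebase stopped, conflicted files: $conflicted"
    git rebase --abort
fi

after_abort=$(git rev-parse HEAD)
current_branch=$(git symbolic-ref --short HEAD)
echo "back on $current_branch at $after_abort"
```

After the abort, the branch is feature_parachute again and it points to the original P2, as if the rebase had never been attempted.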
I push this to my origin with
git push origin feature_parachute
and use the browser UI to create the pull request. Send it off. Done.
Day 3
I have another great idea. Feature “backwards flight” is going to be awesome. Let's code it up.
But first. Wait. Yesterday I left off being on commit F (which looks just like P2) in branch ‘feature_parachute’. That is still in review and being torn apart. This new work should be independent of that. Let's start fresh. I want to start with the latest, bleeding edge commit available: whatever master is pointing at in upstream!
git fetch upstream
Now my local disk repo knows where upstream/master points (turns out, the world was lazy and it still points at D - my pull request hasn’t been merged yet either).
By giving the checkout command a starting point for the new branch, it doesn’t start on commit F, which is currently checked out in my local clone. Instead, it goes back to D and then creates the new branch here.
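Starting a branch from an explicit commit rather than from whatever is checked out can be sketched like this. The branch name feature_backwards_flight is a hypothetical choice, and the local branch upstream_master is an assumed stand-in for upstream/master in a scratch repository.

```shell
#!/bin/sh
# Sketch: git checkout -b <new-branch> <start-point> ignores the currently checked-out commit.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "A"

# Stand-in for upstream/master, pointing at D.
git checkout -q -b upstream_master
echo d > d.txt
git add d.txt
git commit -q -m "D"
commit_d=$(git rev-parse HEAD)

# Yesterday's feature branch, currently checked out at F.
git checkout -q -b feature_parachute
echo f > parachute.c
git add parachute.c
git commit -q -m "F"
commit_f=$(git rev-parse HEAD)

# New branch (hypothetical name) that starts at D, not at the checked-out F.
git checkout -q -b feature_backwards_flight upstream_master
start=$(git rev-parse HEAD)
echo "new branch starts at $start"
```

The new branch starts at D even though F was checked out a moment earlier, so the parachute work does not leak into the new feature.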
Rinse, repeat.
Day 4
I decided that it might be cooler to find out how Git works than to write code. I spent the entire day figuring out exactly how I can rewrite my local history using ‘git rebase --interactive’.
Day 27
I should fly more.
Caveats:
I just wrote this big article with minimal checking of spelling and command correctness. My main point is that you need to find out how Git works, and then all these operations will start to make sense. I hope you had fun reading it, though.