If you work in the software industry, you have probably heard the two terms "Git" and "GitHub". Before jumping into Git, though, you need to understand what version control is and what problems the software industry faced before Git. So,
What is a Version Control System?
A version control system is software that tracks changes to a file or set of files over time so that you can recall specific versions later. It also allows you to work together with other programmers.
A version control system is a collection of software tools that help a team manage changes to source code. It uses a special kind of database to keep track of every modification to the code.
Developers can compare the current code with earlier versions to find and fix mistakes.
Benefits
Speeds up project development through efficient collaboration.
Reduces the possibility of errors and conflicts during development by making every small change traceable.
Contributors can work on the project from anywhere, regardless of their geographical location.
Each contributor works on a separate working copy, which is merged into the main codebase only after it has been validated.
Records who made each change, what was changed, when, and why.
Helps with recovery after a disaster.
Types of VCS
1. Centralized Version Control Systems
2. Distributed Version Control Systems
Centralized Version Control Systems
A centralized version control system (CVCS) has a single central repository, and every user must commit to it for their changes to be recorded. Others see your changes when they update their working copy.
A CVCS has some drawbacks:
a) The repository is not available locally, so you need a network connection to commit, view history, or perform most other version control operations.
b) Since everything is centralized, if the central server fails and there is no backup, you can lose the entire project history.
A central repository (or central server) is storage on a remote server where you and others can keep code and access it.
Distributed Version Control Systems
In a DVCS, every developer has a full copy of the repository, including the entire history of all changes. This makes it easier for developers to work together, as they don't have to contact a central server to commit their changes or browse the history. A connection is needed only when exchanging changes with others.
Because developers have a local copy of the repository, they can commit their changes and perform other version control actions faster, as they don't have to communicate with a central server.
With a DVCS, developers can work and commit offline, then push their changes later when they have an internet connection. They can also choose to share their changes with only a subset of the team, rather than pushing all of their changes to a central server.
In a DVCS, the repository history is stored on multiple servers and computers, which makes it more resistant to data loss.
Git is an example of a DVCS.
Difference between CVCS and DVCS
1. In a CVCS, a client gets a working copy of the source from the server, makes changes, and commits them directly to the central repository. In a DVCS, each client has a complete local repository with the full history. The client commits to a local branch first, then pushes those commits to the server repository.
2. A CVCS is generally easier to learn and set up. A DVCS can be harder for beginners because there are more concepts and commands to remember.
3. Branching and merging tend to be cumbersome in a CVCS, and developers often run into merge conflicts. In a DVCS, branches are cheap to create and merging is generally easier.
4. A CVCS does not provide offline access to the repository. A DVCS works offline because each client has a copy of the entire repository on the local machine.
5. A CVCS is slower because most commands need to communicate with the server, while a DVCS is faster because most operations run against the local copy.
6. If the CVCS server goes down, developers cannot commit or look at the history. If the DVCS server goes down, developers can keep working with their local copies.
What is Git?
Git is a version control system that allows you to track changes to files and coordinate work on those files among multiple people. It is commonly used for software development, but it can be used to track changes to any set of files.
With Git, you can keep a record of who made changes to what part of a file, and you can revert back to earlier versions of the file if needed. Git also makes it easy to collaborate with others, as you can share changes and merge the changes made by different people into a single version of a file.
It is used for:
Tracking code changes
Tracking who made changes
Coding collaboration
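As a rough illustration of tracking who changed a file and going back to an earlier version, the sketch below builds a throwaway repository in a temporary directory. The file name notes.txt, the identity "Example User", and the commit messages are placeholders made up for this example.

```shell
set -eu

# Throwaway repository in a temporary directory (hypothetical example data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity, local to this repo
git config user.email "user@example.com"

# Two commits that change the same file.
echo "first version" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "second version, revised" > notes.txt
git commit -q -a -m "Update notes"

# Who changed the file and why (short ID, author, commit message).
git log --format='%h %an: %s' -- notes.txt

# Bring the file back to the state it had one commit earlier.
git checkout -q HEAD~1 -- notes.txt
restored=$(cat notes.txt)
echo "$restored"
```

Here git checkout with a commit and a path only restores that one file; the history itself is left untouched.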
What does Git do?
Manage projects with Repositories
Clone a project to work on a local copy
Control and track changes with Staging and Committing
Branch and Merge to allow for work on different parts and versions of a project
Pull the latest version of the project to a local copy
Push local updates to the main project
What is GitHub?
GitHub is a web-based platform that provides hosting for version control using Git. It is a subsidiary of Microsoft, and it offers all of the distributed version control and source code management (SCM) functionality of Git as well as adding its own features. GitHub is a very popular platform for developers to share and collaborate on projects, and it is also used for hosting open-source projects.
Actual Workflow of Git
Initialize Git in a folder with the git init command, which turns the folder into a local repository. A hidden folder named .git is created.
Alternatively, clone an existing project to work on a local copy.
Working directory and staging area: the working directory is where you see the files physically and modify them. You work on one branch at a time. Moving changes from the working directory to the staging area, to prepare them for a commit, is known as Add (git add).
In a CVCS, developers generally make modifications and commit their changes directly to the central repository. Git uses a different strategy: it does not automatically include every modified file in a commit. Whenever you perform a commit, Git looks at the files present in the staging area. Only those files are included in the commit, not all the modified files.
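A minimal sketch of the staging behaviour described above, run in a throwaway repository. The file names a.txt and b.txt and the identity are placeholders chosen for this example. Both files are modified, but only the one that was staged ends up in the commit.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"

# Starting point: both files committed once.
echo "first draft" > a.txt
echo "first draft" > b.txt
git add a.txt b.txt
git commit -q -m "Add a.txt and b.txt"

# Modify both files, but stage only a.txt.
echo "edited" > a.txt
echo "edited" > b.txt
git add a.txt
git commit -q -m "Update a.txt only"

# Files recorded in the latest commit.
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "committed: $committed"

# Files that are modified but still not committed.
pending=$(git diff --name-only)
echo "pending: $pending"
```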
Commit: Sending code from the staging area to the local repository is known as a commit (a saved snapshot). Every commit gets a unique commit ID.
Commit ID: Anyone who needs to look at the code later can find it through its commit ID. By default, a commit ID is 40 hexadecimal characters, computed as a SHA-1 checksum of the commit's contents. Even if you change a single dot, the commit ID changes, which helps you track changes. The commit ID is also called the SHA or commit hash.
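To see the hashing behaviour in isolation, the sketch below uses git hash-object, which computes the ID Git would assign to a piece of file content. This is a blob ID rather than a commit ID, but by default both are SHA-1 hashes, so the same properties hold. The sample strings are made up for the example, and the 40-character length assumes the default SHA-1 object format.

```shell
set -eu

# Hash two inputs that differ by a single dot.
id_a=$(printf 'hello world\n' | git hash-object --stdin)
id_b=$(printf 'hello world.\n' | git hash-object --stdin)

echo "$id_a"
echo "$id_b"

# Number of characters in an ID.
echo "length: ${#id_a}"
```

The two IDs have nothing visibly in common even though the inputs differ by one character, which is what makes the ID useful for detecting any change.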
Snapshot: Each commit records a snapshot of the whole project as it was staged, not just a list of changed lines. For example, if you change 4-5 lines in one file, the new commit stores a new version of that one file, and every file that did not change is not stored again; the snapshot simply points to the copy Git already has. This keeps storage small.
Git can reduce storage further by compressing objects and, when it packs the repository, storing similar versions of a file as differences.
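One way to look at what a snapshot contains is to compare the object IDs of the same files in two consecutive commits. The sketch below does this in a throwaway repository; the file names and the identity are placeholders for this example. A file that did not change between the two commits has the same ID in both, which indicates that Git refers to one stored copy rather than saving the file twice.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"

echo "this file never changes" > unchanged.txt
echo "first version" > changing.txt
git add unchanged.txt changing.txt
git commit -q -m "First snapshot"

echo "second version, revised" > changing.txt
git add changing.txt
git commit -q -m "Second snapshot"

# Object IDs of each file in the previous commit (HEAD~1) and the latest (HEAD).
same_old=$(git rev-parse HEAD~1:unchanged.txt)
same_new=$(git rev-parse HEAD:unchanged.txt)
diff_old=$(git rev-parse HEAD~1:changing.txt)
diff_new=$(git rev-parse HEAD:changing.txt)

echo "unchanged.txt: $same_old -> $same_new"
echo "changing.txt:  $diff_old -> $diff_new"
```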
Push: The push operation copies commits from the local repository to a remote repository (for example, on GitHub). This shares the changes with others and keeps a copy of them outside your machine.
Pull: The pull operation copies changes from a remote repository to the local machine and merges them into the current branch. It is used to keep the two repositories in sync.
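Push and pull can be tried without a GitHub account by letting a local bare repository stand in for the remote. Everything in the sketch below is an assumption made for illustration: the directory names remote.git, alice, and bob, the identity, and the file notes.txt. With a real GitHub repository the commands are the same; only the remote URL differs.

```shell
set -eu

work=$(mktemp -d)

# A bare repository plays the role of the remote (stand-in for GitHub).
git init -q --bare "$work/remote.git"

# First developer: clone, commit locally, then push.
git clone -q "$work/remote.git" "$work/alice"
cd "$work/alice"
git config user.name "Alice Example"        # placeholder identity
git config user.email "alice@example.com"
echo "shared notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
branch=$(git rev-parse --abbrev-ref HEAD)   # main or master, depending on the Git setup
git push -q origin "$branch"

# Second developer: clone the remote, which now contains the first commit.
git clone -q -b "$branch" "$work/remote.git" "$work/bob"

# First developer pushes another change...
echo "more notes" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q origin "$branch"

# ...and the second developer pulls it into the local copy.
cd "$work/bob"
git pull -q --ff-only origin "$branch"
cat notes.txt
```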
Branch: Branching is an important concept in Git.
The diagram above visualizes a repository with two isolated lines of development. By developing them as branches, it's not only possible to work on both of them in parallel, but it also keeps the default branch (Master) free of unfinished or broken code.
Each task has one separate branch.
Once the code is done, merge the branch into Master.
This is useful for parallel development: many people can work on their own branches at the same time, and their work does not affect the main branch.
You can create any number of branches.
Changes are personal to that particular branch.
The default branch is traditionally named master; newer repositories, including those created on GitHub, often use main.
Files created in the workspace but not yet committed are visible from every branch you switch to. Once you commit a file, it belongs to that particular branch.
When you create a new branch, it starts with the same content as the existing branch (this happens only once, when the branch is created). After that, the two branches change independently.
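The branch behaviour listed above can be tried in a throwaway repository. The branch name feature, the file names, and the identity are placeholders for this sketch. A commit made on the feature branch stays off the default branch until the two are merged.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
default_branch=$(git rev-parse --abbrev-ref HEAD)   # master or main

# Create a branch for one task and commit on it.
git checkout -q -b feature
echo "new work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Back on the default branch, the committed file is not there.
git checkout -q "$default_branch"
if [ -e feature.txt ]; then before="present"; else before="absent"; fi
echo "before merge: $before"

# Merging brings the branch's commits into the default branch.
git merge -q feature
if [ -e feature.txt ]; then after="present"; else after="absent"; fi
echo "after merge: $after"
```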
Tasks
1. Install git on your system
I am using Linux (Ubuntu, on an AWS instance), where Git is already installed by default. My Git version is 2.41.0.
2. Create a free account on GitHub.
You can create a free account by signing up at github.com.
3. Make a directory and turn it into a local repository with the git init command
I created a directory named test-repository and changed into it. Then I made it a Git repository with the git init command, which creates a hidden .git folder.
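A sketch of this step as commands. It creates test-repository inside a temporary directory, which is an assumption made so that the example does not touch any existing files; in practice you would create the directory wherever you keep your projects.

```shell
set -eu

base=$(mktemp -d)

mkdir "$base/test-repository"
cd "$base/test-repository"

# Turn the directory into a local Git repository.
git init -q

# The hidden .git folder only shows up when hidden files are listed.
ls -a
```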
4. Create a new repository on GitHub and clone it to your local machine
I created a repository on GitHub named Task-Repo and cloned it to my machine.
5. Make some changes to a file in the repository and commit them to the repository using Git
I cloned the repository that I created on GitHub to my machine using its URL.
I configured Git with the git config --global user.name and git config --global user.email commands, so that each commit records the author's name and email address.
Next, I changed into the cloned repository and created a file called devops.txt.
I added the file with the git add <filename> command and checked git status, which showed the file as staged. Then I committed the file with git commit -m "First commit"
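The sequence for this task can be sketched as below. To keep the example self-contained it runs in a freshly initialized temporary repository rather than the cloned Task-Repo, and it sets the identity without --global so that it applies to this repository only. The name, email address, and file content are placeholders.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Identity recorded in each commit (repository-local here;
# adding --global would set it for all repositories of this user).
git config user.name "Example User"
git config user.email "user@example.com"

echo "Learning Git and GitHub" > devops.txt

# Stage the file and check what is staged.
git add devops.txt
git status --short

# Commit the staged file and show who committed what.
git commit -q -m "First commit"
git log --format='%an <%ae> %s'
```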
6. Push the changes back to the repository on GitHub
Then I pushed the commit to GitHub with the command git push origin main (my default branch is main here; in some repositories it is master).
While pushing, I had to enter my GitHub username and a personal access token (in GitHub, go to Developer settings -> Personal access tokens -> Generate new token).
This is my GitHub repository, and the file (devops.txt) was pushed successfully.