
What is REST?

The Web has radically transformed the way we produce and share information, democratizing how information is created and consumed. It has also been through major inflection points, named as only software engineers would name them.

Web 1.0 – Was about static pages.

Web 2.0 – Is/was about responsive, dynamic content targeted at various form factors such as mobiles and tablets. It has evolved primarily into a “Mobile First” movement. Web 2.0 is also largely about user-generated content and social media.

Web 3.0 – Is about being API First. It is about IoT devices, wearables, smart watches and other smart devices, and about building applications that cater to humans and apps alike. It is about connective intelligence, connecting data, concepts, applications and ultimately people.

With this focus on “API first”, as developers we need to worry about: “How consumable is the API? Does it follow standards? Is it secure? How do you handle versioning?”

There are various architectural models for developing API First systems, such as REST, SOAP, CORBA, COM, RPC, etc. One of the foremost models gaining a lot of ground currently is REST, which stands for Representational State Transfer. REST is a stateless, resource-based client-server communication model in which most operations are idempotent. As an architectural model, REST defines resources, identifiers and state representations as its basic building blocks. In practice, REST uses HTTP as the protocol and web servers as REST servers.

Resources – Resources are the fundamental building blocks of web-based systems. Anything that can be named can be a resource (a person, a product, a device, a web page, etc.). Almost anything can be modelled as a resource and then made available for manipulation over the network.

Resource Identifiers – With so many different resources, each one must be uniquely identifiable and addressable. The web provides the URI (Uniform Resource Identifier) for this purpose. The relationship between URIs and resources is many-to-one: a URI identifies only one resource, but a resource can have more than one URI pointing to it. A URI takes the form <scheme>:<scheme-specific-structure>. The scheme defines how the rest of the identifier is to be interpreted. For example, the http part of the URI http://themoviedb.org/movie/550 indicates that the rest of the URI must be interpreted according to the http scheme, and it is the responsibility of the agent listening at themoviedb.org to understand that scheme and locate the resource identified by the remainder of the URI. URLs and URNs are alternate forms of URIs. A URI that identifies the mechanism by which a resource may be accessed is usually referred to as a URL; HTTP URIs are examples of URLs. The term is now considered more or less obsolete, since in practice URIs generally carry access-protocol-specific information anyway. A URN is a URI with “urn” as the scheme and is used to convey unique names within a particular “namespace.” As an example, a product can be uniquely identified by the URN urn:rfidic:product:core:ACME:sku:679769.

Resource Representations – Resources can have multiple representations. A representation is a machine-readable description of the current state of a resource. For example, one application might represent a person as a customer in XML, while another might represent the same person as a JPEG image, and another as a voice sample in MP3 format. A resource is an information abstraction of a real-world object or process, and as such may have many representations. Access to a resource is always mediated by way of its representations. This separation between a resource and its representations promotes loose coupling between backend systems and consuming applications. It also helps with scalability, since a representation can be cached and replicated. The Web doesn’t prescribe any particular structure or format for resource representations: a representation can just as well take the form of an image, a video, a JSON document or a text file. This ecosystem of formats (which includes HTML for structured documents, PNG and JPEG for images, MPEG for videos, and XML and JSON for data), combined with the large installed base of software capable of processing them, has been a catalyst in the Web’s success.

Since a resource can have multiple representations, the client needs a way to indicate which representation it wants. There are two ways of doing this. The first is content negotiation, in which the client distinguishes between representations based on the value of an HTTP header. Using content negotiation, consumers negotiate for specific representation formats from a service by populating the HTTP Accept request header with a list of media types they are prepared to process. It is ultimately up to the owner of a resource to decide what constitutes a good representation of that resource in the context of the current interaction, and hence which format should be returned; the format actually returned is always specified in the Content-Type header of the HTTP response. The second approach is to give the resource multiple URLs, one URL for every representation. When this happens, the server that publishes the resource should designate one of those URLs as the official or “canonical” URL.
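
As a rough sketch of the first approach, the requests below ask for the movie resource from the earlier URI example in two different formats via the Accept header (the endpoint is used purely for illustration; the real themoviedb.org API may behave differently):

> curl -i -H "Accept: application/json" http://themoviedb.org/movie/550
> curl -i -H "Accept: application/xml" http://themoviedb.org/movie/550

The -i flag prints the response headers, so the Content-Type header in each response shows which representation the server actually chose to return.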

 

Putting it together – How resources, identifiers and representations drive interactions.

On the web we need to act on the objects and subjects represented by resources. These resources are acted upon by verbs provided by the HTTP methods. The four main verbs of the uniform interface are GET, POST, PUT, and DELETE, and these verbs are used to communicate between systems. HTTP also defines a set of response codes, such as 301 Moved Permanently, 303 See Other, 200 OK, 201 Created, and 404 Not Found, with which a server answers these requests. Together, verbs and status codes provide a general framework for operating on resources over the network. You can call GET on a service or resource as many times as you want with no side effects.
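
As a hedged illustration of the uniform interface, the commands below exercise the four verbs against a hypothetical products resource (example.org, the identifier 42 and the payloads are made up purely for illustration):

> curl -X GET http://example.org/products/42
> curl -X POST -d "name=lamp" http://example.org/products
> curl -X PUT -d "name=desk lamp" http://example.org/products/42
> curl -X DELETE http://example.org/products/42

A typical service would answer the POST with 201 Created, the GET and PUT with 200 OK, and a GET for a deleted or unknown product with 404 Not Found; the GET can be repeated any number of times without side effects.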


GIT Basics

Git, affectionately termed “the information manager from hell” by its creator Linus Torvalds, is a distributed version control system. It allows for fast scaling, has robust branching features, and is open source software. It is also a relatively new version control system, released in April 2005. It was created to version control the Linux kernel source code, and as such had to prove itself on a very large codebase from birth. Git is optimized for distributed development, large file counts and complex merge scenarios, and it is blazingly fast.

Before we dive into using Git, we need to cover some basic concepts and understand how Git differs from other VCSs.

GIT Snapshots

Almost all version control systems store information as a list of file-based changes. When a file is added to a version control system it is marked as a baseline, and all further changes to this file are stored over time as deltas against that base file. This baseline of the file and the history associated with the changes are stored on a central server. The advantage of this design is that every developer knows what the others are changing and the administrator has a greater level of control over the centralized repository. This design, however, requires users to be connected to the central server both to view history and to commit changes.

Git follows a different design and stores changes as a series of snapshots of the complete folder structure. Every time a commit is made in Git, a snapshot of the folder at that point in time is saved, and Git maintains a reference to this snapshot as a SHA-1 hash. During subsequent commits, if some files in the snapshot have not changed, Git only stores a reference to the same file from the previous snapshot. This prevents duplication of data that has not changed across snapshots, and it makes Git behave more like a file system with powerful tools built on top of it. Most operations in Git are local and do not need any connection to other computers on the network.

Git maintains integrity by checksumming all content. Git creates checksums by calculating SHA-1 hashes of all objects (files, folder structures, etc.). A SHA-1 hash is a forty-character string like 09182c053bb89b35c3d222ef100b48cff5e3d346. You will see these hashes everywhere when using Git.
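
To see this content addressing in action, you can ask Git to checksum an arbitrary piece of content yourself (a minimal sketch, assuming a bash-style shell):

> echo "hello world" | git hash-object --stdin
3b18e512dba79e4c8300dd08aeb37f8e728b8dad

The same content always produces the same forty-character SHA-1, which is what allows Git to detect corruption and to share unchanged objects between snapshots.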

Repository

The repository is the place where the version control system keeps track of all the changes you make. You can think of a repository as a bank vault and its history as the ledger. To put it simply, for Git a repository is any directory with a .git folder.

Repositories can be centralized or distributed. Version control systems like CVS, Subversion and VSS follow the centralized repository model. All changes are maintained on a central repository, and each developer sends changes to the centralized repository after working on them locally. With a centralized repository developers have to rely on it for history: no history information is available locally, and if disconnected, developers have only their latest version locally. It also requires a consistent network connection to commit changes, create branches and so on. This is a critical issue; I have worked with VSS over a remote connection a decade back and have seen it take almost an hour to check out a single code file. In a distributed version control system each team member has a local repository with a complete history of the project. Changes can be made and committed locally, with the complete change history available. Additionally, complete history and functionality are available even when disconnected from any origin sources.

A Git repository is a collection of refs (branches and tags; branches are also known as heads). A ref is a named, mutable pointer to an object. A Git repository contains four kinds of objects: a blob (file), a tree (directory), a commit (revision), or a tag. Every object is uniquely identified by a 40-hex-digit number, which is the SHA-1 hash of its contents. Git makes this easier by letting us reference an object by just the first 5-7 characters of its hash. Blobs and trees represent files and directories. Tags are named references to another object, along with some additional metadata. A commit object contains a tree id, zero or more parents (commit ids), an author (name, email, date), a committer (name, email, date) and a log message.
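
The plumbing command git cat-file is a handy way to look at these objects directly. A small sketch, where the abbreviated hash is just a stand-in for a commit in your own repository:

> git log --oneline -1          # prints the abbreviated SHA-1 of the latest commit, e.g. a1b2c3d
> git cat-file -t a1b2c3d       # prints the object type, here: commit
> git cat-file -p a1b2c3d       # pretty-prints the commit: tree id, parents, author, committer, message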

Working tree

A “working tree” consists of the files that you are currently working on. This is the developer’s working directory and contains changes that may or may not get promoted into the repository.

Stage or Index

An “index” is a staging area where new commits are prepared. It acts as an intermediary between a repository and a working tree. Changes made in the working tree will not be committed directly to the repository. They need to be staged on the index first. All changes residing on the index will be the ones that actually get committed into the repository. Changes in the index can also be un-staged or rolled back if necessary.
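
A small sketch of the stage/un-stage cycle described above (app.js is just a placeholder file name):

> git add app.js            # move the change from the working tree to the index
> git status                # app.js now shows up under "Changes to be committed"
> git diff --staged         # show exactly what is staged, compared with the last commit
> git reset HEAD app.js     # un-stage the change; the edit itself remains in the working tree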

 

GIT Lifecycle

The general workflow for using Git is as follows (a command-line sketch of these steps appears after the list):

  1. Initialize or clone a Git repository.
  2. Modify the working copy by adding or changing files.
  3. Stage the necessary changes.
  4. Review the changes before committing.
  5. Commit the changes.
  6. Finally, push the changes to a remote repository as needed.
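
A minimal command-line run through these six steps might look like the following (the file name and remote URL are placeholders):

> git init                                  # 1. initialize an empty repository in the current folder
> git add index.html                        # 2-3. modify/add files, then stage the changes
> git status                                # 4. review what is about to be committed
> git commit -m "Add landing page"          # 5. record the staged snapshot locally
> git remote add origin https://github.com/me/demo.git
> git push -u origin master                 # 6. push the commits to the remote repository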

 

[diagram: Git workflow]

The Git Command Line

[image: the Git command line]

Basic Commands

  • git init – Create an empty Git repository in the current directory. By default it will have one branch named master.
  • git clone – Clone a Git repository from a source. This may be over HTTP, SSH, or the Git protocol, or it may be a path to another local repository.
  • git add – Add file contents from the working tree to the index.
  • git commit – Store the changes that have been staged in the index as a new commit in the repository.
  • git status – View the status of your files in the working directory and staging area.
  • git log – Show the commit history of the current branch.

Branching and merging

  • git branch – List all the branches in your repo, and show which branch you are currently on.
  • git commit – Commit the files you have staged with git add.
  • git reset – Un-stage changes or move the current branch back to an earlier commit; with the --hard option it also discards local changes in the working tree.
  • git push – Send your committed changes to a remote repository.
  • git pull – Fetch changes from the remote server and merge them into your working directory.
  • git clone – Create a working copy of a repository.
  • git merge – Merge a different branch into your active branch (a short branch-and-merge session using these commands is sketched below).
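
A short branch-and-merge session built from these commands (branch and file names are illustrative; git checkout -b creates a branch and switches to it in one step):

> git checkout -b feature/login     # create a new branch and switch to it
> git add login.html
> git commit -m "Add login form"    # commit on the feature branch
> git checkout master               # switch back to master
> git merge feature/login           # merge the feature branch into master
> git push origin master            # publish the merged history to the remote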

 

Online providers

  • GitHub – is a site for online hosting of Git repositories. Many open source projects use it, such as AngularJS, React, Docker, TypeScript etc.
  • Atlassian Bitbucket – is another online hosting service for Git repositories.

Docker Part 3 – Taking Docker for a spin on Windows.

To install Docker, head over to the Download Docker for Windows page if you are running Windows 10 Pro or Enterprise (64-bit) and download the installer. If you are on a previous version of Windows you need to download Docker Toolbox instead. You also need to enable Hyper-V virtualization for Docker to work. The setup installs the daemon and the CLI tools. The Docker engine is installed as a service, and you also get a helper on the task bar which can be used to further configure Docker. You can also download Kitematic from download.docker.com to graphically manage Docker containers.

Once installed, head over to a command prompt and type docker --help to see the list of commands available to interact with Docker.

Docker Options

For a quick sanity check, run the command docker version to check the version of Docker installed. As of writing this post my local instance of Docker was on version 1.12.0 for both the client and server components.

Let's start by searching for the official dotnet core image from Microsoft to load into a container. To search for the image we use the command

> docker search microsoft/dotnet

[screenshot: list of dotnet images returned by docker search]

 

This command returns the above list of dotnet images available from Microsoft. Next, let's pull an image using the command docker pull followed by the name of the image to pull.

[screenshot: output of docker pull microsoft/dotnet, showing the image layers being downloaded and extracted]

 

We can see that the microsoft/dotnet image with the latest tag has been downloaded. Another interesting detail to note is that the various layers are pulled and extracted asynchronously to create this image locally.

Now let us check the images available locally using the command docker images.

[screenshot: output of docker images listing the local images]

 

We have 3 images available locally, including a redis and an Ubuntu image. Now, for the first run, fire the command below:

> docker run -d redis

This command loads the redis image into a container. The -d option runs the container in the background. Now let us run the command below to check on the running containers:

> docker ps

The output indicates that there is a redis container that has been running for the past four minutes, with an entry point and a name. To get more details about this container we can use the docker inspect command, which returns JSON output with detailed information about the container such as its IP address, mounted volumes, etc.

We can use the docker logs command to view messages that the container has written to standard output or standard error.
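
Putting these together, a session like the following could be used; the container name placeholder stands for whatever docker ps reports, since Docker auto-generates a name when --name is not supplied:

> docker ps --format "{{.Names}}: {{.Status}}"                          # names and status of running containers
> docker inspect <container-name>                                       # full JSON description of the container
> docker inspect -f "{{.NetworkSettings.IPAddress}}" <container-name>   # extract a single field, here the IP address
> docker logs <container-name>                                          # stdout/stderr the container has written so far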

[screenshot: output of docker logs for the redis container]

Docker Part 2 – Docker Architecture

The Foundation

Before diving into how Docker is designed and built, it helps to understand the underlying technologies from which it is built.

The concept of containers has been in the making for some time. Docker uses Linux namespaces and cgroups, which have been part of the Linux kernel since around 2007.

cgroups – Control groups provide a mechanism to place resource limits on a group of processes in Linux. These processes can be organized hierarchically as a tree of process groups, subgroups, etc. Limits can be placed on shared resources such as CPU, memory, network or disk IO. cgroups also allow for accounting, checkpointing and restarting groups of processes.

Namespaces – Linux namespaces are used to create processes that are isolated from the rest of the system without the need for low-level virtualization technology. This provides a way to present different processes with varying views of the system.

Union File System (UFS) – The union file system, also called a union mount, allows multiple file systems to be overlaid, appearing as a single file system to the user. Docker supports several UFS implementations: AUFS, Btrfs, ZFS and others. The installed implementation can be identified by running the docker info command and checking the storage information; on my system the storage driver is aufs. Docker images are made up of multiple layers: each instruction adds a layer on top of the existing layers. When a container is started, Docker adds a read-write file system on top of these layers along with other settings.
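
Two commands make the storage driver and the layering visible; a quick sketch using the microsoft/dotnet image pulled in the Part 3 walkthrough (any local image works):

> docker info                        # the "Storage Driver" line shows which union file system is in use (aufs on my machine)
> docker history microsoft/dotnet    # one row per layer that makes up the image, with the instruction that created it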

The combination of the above technologies allowed the development of Linux Containers (LXC), which is a precursor to Docker.

Docker has two main components based on a client-server design, communicating over HTTP. It also has an image store, or registry.

Docker CLI – The Docker client, or CLI, is used to operate and control the Docker daemon. The client may run on the container host or on a remote machine connected to the container host over HTTP.

Docker Daemon – The Docker daemon is the middleware that runs, monitors and orchestrates containers. It works with the Docker CLI to build, ship and run containers. The user interacts with the Docker daemon through the Docker client.

Docker Registry – This is a registry of images. It contains images, layers and metadata about the images. Docker Hub is a public registry which hosts thousands of public images.
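
Because the CLI and the daemon talk over HTTP, the same client binary can be pointed at a daemon on another machine. A hedged sketch (the host name is hypothetical, and the remote daemon must be configured to listen on a TCP port):

> docker version                                     # prints separate Client and Server (daemon) sections
> docker -H tcp://docker-host.example.com:2375 ps    # run the same CLI command against a remote daemon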

 

[diagram: Docker architecture showing the CLI, daemon and registry]

Docker Part 1


What is Docker?

Docker is a container technology for packaging an application along with all its dependencies. Docker did not invent containers; rather, it made container technology simpler and easier to use.

Difference between Docker and Virtual machine.

Traditionally, applications have been deployed in virtual machines in production. This provides isolation and scalability, since we can spin up multiple VMs on a single machine when needed. While providing these advantages, VMs also bring their own share of problems. A virtual machine provides you with virtual hardware on which the operating system and the necessary dependencies must be installed. This takes a long time, and running a complete copy of the OS and its dependencies carries a significant resource overhead. Each virtualized application includes not only the application and its dependencies but the entire guest operating system.

Unlike virtual machines, Docker containers do not virtualize hardware but run in user space. Additionally, while a virtual machine runs a separate guest OS, Docker containers run within the same OS kernel. Docker avoids the overhead described above by using the container technology built into the operating system. Thus Docker provides the resource isolation and allocation benefits of VMs while being lean, efficient and portable. Docker also ensures a consistent environment for an application across any system, be it development, staging or production. This enables developers to develop and test their programs against a single platform and a single set of dependencies, and puts an end to the age-old developer refrain of “Hey, it works on my machine!”

In a nutshell, the aim of a VM is to fully emulate an OS and a hardware stack, while the fundamental goal of a container is to make an application portable and self-contained.

 

[diagram: virtual machines compared with Docker containers]

 

Container

A container is a logical unit where you can store and run an application and all its dependencies. A container holds an image. An image is a bundled snapshot of everything needed to run a program inside a container. Multiple containers can run copies of the same image. Images are the shippable units in the Docker ecosystem. Docker also provides an infrastructure for managing images using registries and indexes. Docker Hub (https://hub.docker.com/) is an example of such a registry; it is a public registry of images. Individuals and organizations can create private repositories holding private images as well.

The Docker platform primarily has two systems: a container engine, which is responsible for the container lifecycle, and a hub for distributing images. Additionally we also have Swarm, which is a clustering manager, Kitematic, which is a GUI for working with containers, and many other tools which complement the Docker ecosystem. Many more of these are being built by the community, making this a robust and fast-evolving platform.

Why is Docker important?

The fast start-up time of Docker containers and the ability to script them have resulted in Docker taking off, especially within the developer community, as it enables fast, iterative development cycles along with portability and isolation. With DevOps becoming a standard across the industry, Docker fits in nicely with this methodology, allowing teams to manage their build and deployment cycles at a much faster and more predictable pace.

Multiversion Concurrency Control

Every developer I know has used or uses databases on a daily basis, be it SQL Server or Oracle, but quite a few of them are stumped when asked how the following SQL will execute and what its output will be.

insert into Employee select * from Employee;
select * from Employee;

Additionally, what happens if multiple people execute these statements concurrently? Will the select block the insert? Will the insert block the select? How does the system maintain data integrity in this case? The answers have ranged from “this is not a valid SQL statement” to “this will be an indefinite loop.”

Some have also got it right, but without an understanding of what happens in the background that would let them give a consistent and accurate answer. The situation is further complicated by the fact that the mechanism which solves this problem is referred to differently in the various database and transactional systems.

A simple way of solving the problem is by using locks or latching. Using locks ensures data integrity, but it also serializes all reads and writes. This approach is not scalable, since a lock allows only a single transaction, either a read or a write, to proceed. Locking has further evolved with read-only locks and other variations, but it is still inherently a concurrency nightmare. A better approach, which ensures data integrity while remaining scalable, is the multiversion concurrency control (MVCC) pattern.

Multiversion concurrency control involves tagging an object with read and write timestamps, so that an access history is maintained for each data object. This timestamp information is maintained via change set numbers, which are generally stored together with the modified data in rollback segments. This enables multiple “point in time” consistent views based on the stored change set numbers. MVCC thus enables read-consistent queries and non-blocking reads by providing these “point in time” views without the need to lock the whole table. In Oracle these change set numbers are called system change numbers, commonly referred to as SCNs. This is one of the most important scalability features of a database, since it provides maximum concurrency while maintaining read consistency. Applied to the SQL above in an MVCC database such as Oracle: the insert reads a consistent snapshot of Employee as of the moment the statement began, so it simply doubles the table's rows and completes rather than looping, the following select sees the new rows in the same session, and concurrent sessions continue to read the earlier snapshot without blocking until the insert commits.

Internet of Things – IoT

Microsoft recently announced that Visual Studio will start supporting device development, starting with Intel’s Galileo board. This is not entirely new to Visual Studio; one of the earliest add-ons to support board-based development was Visual Micro, which supported the Arduino board.

Arduino is an open-source physical computing platform based on a simple I/O board and an integrated development environment (IDE) that implements the Processing/Wiring language. Arduino can be used to develop stand-alone interactive objects and installations, or it can be connected to software on your computer. The first Arduino board was produced in January 2005. The project is the result of the collaboration of many people, primarily David Cuartielles and Massimo Banzi, who taught physical computing. The others were David Mellis, who built the software engine, Tom Igoe, and Gianluca Martino.

 


Board Layout

[diagram: Arduino board layout with the color-coded pin groups described below]

 

Starting clockwise from the top center:

• Analog Reference pin (orange)

• Digital Ground (light green)

• Digital Pins 2-13 (green)

• Digital Pins 0-1/Serial In/Out – TX/RX (dark green)

– These pins cannot be used for digital i/o (digitalRead and digitalWrite) if you are also using serial communication (e.g. Serial.begin).

• Reset Button – S1 (dark blue)

• In-circuit Serial Programmer (blue-green)

• Analog In Pins 0-5 (light blue)

• Power and Ground Pins (power: orange, grounds: light orange)

• External Power Supply In (9-12VDC) – X1 (pink)

• Toggles External Power and USB Power (place jumper on two pins closest to desired supply) – SV1 (purple)

• USB (used for uploading sketches to the board and for serial communication between the board and the computer; can be used to  power the board) (yellow)

Installing the Arduino software on your computer:

To program the Arduino board, you must first download the development environment (the IDE) from www.arduino.cc/en/Main/Software. Choose the right version for your operating system. After this, you need to install the drivers that allow your computer to talk to your board through the USB port.

Installing Drivers: Windows

Plug the Arduino board into the computer; when the Found New Hardware Wizard window comes up, Windows will first try to find the driver on the Windows Update site. Windows Vista will first attempt to find the driver on Windows Update; if that fails, you can instruct it to look in the Drivers\FTDI USB Drivers folder. You’ll go through this procedure twice, because the computer first installs the low-level driver, then installs a piece of code that makes the board look like a serial port to the computer. Once the drivers are installed, you can launch the Arduino IDE and start using Arduino.


Arduino Programming language

The basic structure of the Arduino programming language is fairly simple and runs in at least two parts. These two required parts, or functions, enclose blocks of statements.

void setup()
{
    //statements
}
void loop()
{
     //statements
}

Where setup() is the preparation, loop() is the execution. Both functions are required for the program to work.

The setup function should follow the declaration of any variables at the very beginning of the program. It is the first function to run in the program, is run only once, and is used to set pinMode or initialize serial communication.

The loop function follows next and includes the code to be executed continuously – reading inputs, triggering outputs, etc. This function is the core of all Arduino programs and does the bulk of the work.

Functions

A function is a block of code that has a name and a block of statements that are executed when the function is called.

Custom functions can be written to perform repetitive tasks and reduce clutter in a program. Functions are declared by first declaring the function type. This is the type of value to be returned by the function such as ‘int’ for an integer type function. If no value is to be returned the function type would be void. After type, declare the name given to the function and in parenthesis any parameters being passed to the function.

type functionName(parameters)
{
    statements;
}

Variables

A variable is a way of naming and storing a value for later use by the program. As their name suggests, variables can be continually changed, as opposed to constants, whose value never changes. A variable can be declared in a number of locations throughout the program, and where this declaration takes place determines which parts of the program can use the variable.

Variable scope

A variable can be declared at the beginning of the program before void setup(), locally inside of functions, and sometimes within a statement block such as for loops. Where the variable is declared determines the variable scope, or the ability of certain parts of a program to make use of the variable.

A global variable is one that can be seen and used by every function and statement in a program. This variable is declared at the beginning of the program, before the setup() function.

A local variable is one that is defined inside a function or as part of a for loop. It is only visible and can only be used inside the function in which it was declared. It is therefore possible to have two or more variables of the same name in different parts of the same program that contain different values. Ensuring that only one function has access to its variables simplifies the program and reduces the potential for programming errors.