hadoop fs -ls

Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>
Examples:
hadoop fs -ls /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -ls /user/hadoop/dir1/filename.txt
hadoop fs -ls hdfs://<hostname>:9000/user/hadoop/dir1/
hadoop fs -ls /user/hadoop/file1
Options:
-d: Directories are listed as plain files.
-h: Format file sizes in a human-readable fashion (e.g., 64.0m instead of 67108864).
-R: Recursively list subdirectories encountered.
-t: Sort output by modification time (most recent first).
-S: Sort output by file size.
-r: Reverse the sort order.
-u: Use access time rather than modification time for display and sorting.
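To make the -h option above concrete, here is a minimal Python sketch of the kind of conversion it performs. The function name, suffix letters, and rounding are assumptions chosen to match the 64.0m example; Hadoop's own formatter may differ in spacing and suffix style between versions.

```python
def human_readable(num_bytes):
    """Format a byte count with binary (1024-based) suffixes."""
    suffixes = ["", "k", "m", "g", "t", "p"]
    value = float(num_bytes)
    index = 0
    # Divide by 1024 until the value drops below 1024 or we run out of suffixes.
    while value >= 1024 and index < len(suffixes) - 1:
        value /= 1024
        index += 1
    if index == 0:
        # Small sizes are shown as a plain byte count.
        return str(num_bytes)
    return f"{value:.1f}{suffixes[index]}"


print(human_readable(67108864))  # 64.0m
```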
hadoop fs -appendToFile

Usage: hadoop fs -appendToFile <localsrc> ... <dst>
Appends a single source, or multiple sources, from the local file system to a file on the destination file system. It can also read input from stdin (standard input) and append it to the destination file.
Examples:
hadoop fs -appendToFile localfile /user/hadoop/hadoopfile
hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile
hadoop fs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile
hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile
The last form reads the input from stdin.


Hadoop examples jar

A common question: where can I find the Hadoop examples.jar file, or the Hadoop MapReduce example jar?

Answer: under /usr/local/lib, as follows (replace 4.2.1 with the specific Cloudera version you are using):

mrv1: /usr/local/lib/hadoop_mr1/hadoop-examples-2.0.0-mr1-cdh4.0.1.jar
mrv2: /usr/local/lib/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.2.1.jar

So we can run it as:
hadoop jar /usr/local/lib/hadoop_mr1/hadoop-examples-2.0.0-mr1-cdh4.0.1.jar pi 500 40
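The two arguments to the pi example are the number of map tasks and the number of samples per map. As a rough illustration of what such a job computes, here is a single-process Python sketch. It uses plain pseudo-random sampling and a hypothetical function name, so treat it as an assumption-laden stand-in rather than the bundled example's actual implementation.

```python
import random


def estimate_pi(num_maps, samples_per_map, seed=0):
    """Estimate pi by sampling points in the unit square, in map-sized batches."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_maps):
        # Each "map task" counts how many of its points fall inside the quarter circle.
        for _ in range(samples_per_map):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:
                inside += 1
    # The "reduce" step combines the counts into a single estimate.
    return 4.0 * inside / (num_maps * samples_per_map)


print(estimate_pi(500, 40))
```

More maps or more samples per map tighten the estimate, at the cost of more work.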

HDFS (Hadoop Distributed File System)

HDFS (Hadoop Distributed File System) is a file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.

HDFS is a filesystem written in Java
--Based on Google’s GFS
Sits on top of a native filesystem
--ext3, xfs etc
Provides redundant storage for massive amounts of data
--Using cheap, unreliable computers
HDFS performs best with a ‘modest’ number of large files
--Millions, rather than billions, of files
--Each file typically 100 MB or more
Files in HDFS are ‘write once’
--No random writes to files are allowed
HDFS is optimized for large, streaming reads of files
--Rather than random reads


How Files Are Stored
Files are split into blocks.
Data is distributed across many machines at load time
--Different blocks from the same file will be stored on different machines
--This provides for efficient MapReduce processing
Blocks are replicated across multiple machines, known as DataNodes
--Default replication is three-fold
  – i.e., each block exists on three different machines
A master node called the NameNode keeps track of which blocks make up a file, and where those blocks are located
--Known as the metadata
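The storage scheme above can be sketched in a few lines of Python. The 128 MB block size, the node names, and the round-robin placement are assumptions for illustration only; a real cluster has a configurable block size and a more careful placement policy.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes
REPLICATION = 3                 # three-fold replication, as described above


def place_blocks(file_size, datanodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return {block_index: [datanode, ...]} using simple round-robin placement."""
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    num_blocks = math.ceil(file_size / block_size)
    placement = {}
    for block in range(num_blocks):
        # Pick `replication` distinct nodes, starting at a different node per block.
        placement[block] = [
            datanodes[(block + r) % len(datanodes)] for r in range(replication)
        ]
    return placement


nodes = ["dn1", "dn2", "dn3", "dn4"]  # hypothetical DataNode names
layout = place_blocks(300 * 1024 * 1024, nodes)
for block, replicas in layout.items():
    print(block, replicas)
```

A 300 MB file becomes three blocks here, and each block lands on three different machines.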

How Files Are Stored:Example

NameNode holds metadata for the data files.
DataNodes hold the actual blocks
--Each block is replicated three times on the cluster


HDFS:Point To Note
When a client application wants to read a file:
--It communicates with the NameNode to determine which blocks make up the file,and which DataNodes those blocks reside on
--It then communicates directly with the DataNodes to read the data
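The read path described above can be modelled as a small Python sketch. The class names, method names, and block ids are hypothetical and only mirror the two steps in the text; this is not the HDFS client API.

```python
class NameNode:
    """Holds only metadata: which blocks make up a file and where each block lives."""

    def __init__(self):
        self.file_blocks = {}      # path -> ordered list of block ids
        self.block_locations = {}  # block id -> list of DataNode names

    def get_block_locations(self, path):
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]


class DataNode:
    """Holds the actual block contents."""

    def __init__(self):
        self.blocks = {}  # block id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(path, namenode, datanodes):
    """Client-side read: ask the NameNode for locations, then fetch from DataNodes."""
    data = b""
    for block_id, locations in namenode.get_block_locations(path):
        # Read from the first listed replica; the file data never passes through the NameNode.
        data += datanodes[locations[0]].read_block(block_id)
    return data


# Hypothetical two-block file stored on two DataNodes.
nn = NameNode()
dns = {"dn1": DataNode(), "dn2": DataNode()}
nn.file_blocks["/user/hadoop/file1"] = ["blk_1", "blk_2"]
nn.block_locations = {"blk_1": ["dn1", "dn2"], "blk_2": ["dn2", "dn1"]}
for dn in dns.values():
    dn.blocks = {"blk_1": b"hello ", "blk_2": b"world"}

print(read_file("/user/hadoop/file1", nn, dns))  # b'hello world'
```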

Big Data Analysis With HDFS

HDFS Concepts: Blocks, Replicas,Namenode, Datanode
NameNode manages the File system Namespace

HDFS ARCHITECTURE

Command line interface

Hdfs File Read

Hdfs File Write


Start-up process
-Namenode enters Safemode
  --Replication does not occur in Safemode
-Each Datanode sends Heartbeat 
-Each Datanode sends Blockreport
  --Lists all HDFS data blocks
-Namenode creates Blockmap from Blockreports
-Namenode exits Safemode
-Replicate any under-replicated blocks
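As a rough sketch of the start-up steps above, the following Python builds a blockmap from block reports and finds the blocks that would need re-replication once Safemode ends. The function names and the report format are assumptions for illustration.

```python
from collections import defaultdict

REPLICATION = 3  # assumed target replication factor


def build_blockmap(block_reports):
    """Combine per-DataNode block reports into {block_id: set of DataNodes}."""
    blockmap = defaultdict(set)
    for datanode, blocks in block_reports.items():
        for block_id in blocks:
            blockmap[block_id].add(datanode)
    return dict(blockmap)


def under_replicated(blockmap, replication=REPLICATION):
    """Return the block ids that have fewer replicas than the target."""
    return sorted(b for b, nodes in blockmap.items() if len(nodes) < replication)


# Hypothetical reports received while the Namenode is in Safemode.
reports = {
    "dn1": ["blk_1", "blk_2"],
    "dn2": ["blk_1", "blk_2"],
    "dn3": ["blk_1"],
}
blockmap = build_blockmap(reports)
print(under_replicated(blockmap))  # ['blk_2']
```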
Checkpoint process
-Performed by Namenode
-Two versions of FsImage
   --One stored on disk
   --One in memory
-Applies all transactions in EditLog to in-memory FsImage
-Flushes FsImage to disk
-Truncates EditLog
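The checkpoint steps above can be illustrated with a small Python sketch. The JSON file, the two operation names, and the dictionary layout are assumptions made for the example; the real FsImage and EditLog use their own formats.

```python
import json
import os
import tempfile


def checkpoint(fsimage, edit_log, fsimage_path):
    """Apply logged transactions to the in-memory image, flush it, and clear the log."""
    for op, path in edit_log:
        if op == "create":
            fsimage[path] = {}
        elif op == "delete":
            fsimage.pop(path, None)
    # Flush the updated in-memory image to disk.
    with open(fsimage_path, "w") as f:
        json.dump(fsimage, f)
    # Truncate the edit log now that its transactions are reflected in the image.
    edit_log.clear()


# Hypothetical namespace and pending transactions.
image = {"/user/hadoop/dir1": {}}
log = [("create", "/user/hadoop/file1"), ("delete", "/user/hadoop/dir1")]
image_path = os.path.join(tempfile.mkdtemp(), "fsimage.json")
checkpoint(image, log, image_path)
print(sorted(image), log)  # ['/user/hadoop/file1'] []
```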
Namenode memory concern
For fast access, the Namenode keeps all block metadata in memory
--The bigger the cluster, the more RAM required
--Best for millions of large files (100 MB or more) rather than billions
--Works well for clusters of hundreds of machines
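A back-of-the-envelope calculation shows why file count matters so much. The figure of 150 bytes per file or block object is an assumed rule of thumb, not a number from this article, and the one-block-per-file default is also an assumption; plug in your own values.

```python
BYTES_PER_OBJECT = 150  # assumed average heap cost per file or block object


def namenode_heap_bytes(num_files, blocks_per_file=1, bytes_per_object=BYTES_PER_OBJECT):
    """Estimate the heap needed to hold one object per file plus one per block."""
    objects = num_files * (1 + blocks_per_file)
    return objects * bytes_per_object


for files in (10**6, 10**9):
    gib = namenode_heap_bytes(files) / 2**30
    print(f"{files:>13,} files -> about {gib:,.1f} GiB of heap")
```

Under these assumptions the estimate grows linearly with the number of files, so a thousand times more files needs a thousand times more Namenode memory.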
Hadoop 2+
--Namenode Federation
--Each Namenode manages part of the namespace and the blocks that belong to it
--Scales the Namenode horizontally
--Support for 1000+ machine clusters
--Yahoo! runs 50,000+ machines
For more detail, visit Apache Hadoop
Namenode’s fault tolerance
Namenode daemon process must be running at all times
--If the process crashes, the cluster is down
Namenode is a single point of failure
--Host it on a machine with reliable hardware (e.g., one that can sustain a disk failure)
--Usually is not an issue
Hadoop 2+
--High Availability Namenode
--A standby Namenode is always running and takes over if the active Namenode fails
--Still in its infancy
Source:HDFS

Internet of Things (IOT)

Content
  1. Introduction to IoT
  2. Evolution of IoT
  3. Why IoT?
  4. General Requirements 
  5. Communication Features 
  6. Technologies Involved
  7. Applications
What’s the Internet of Things
--Internet of Things (IoT) is a computing concept that provides interconnection between uniquely identifiable devices.
--By integrating several technologies, such as actuators and sensor networks, identification and tracking technology, enhanced communication protocols, and the distributed intelligence of smart objects, IoT enables communication between the real-world objects around us.
--From any-time, any-place connectivity for anyone, we will now have connectivity for anything!
IOT Structure

History
--1997, the ITU Internet Reports series is launched under the title “Challenges to the Network”; “The Internet of Things” is the seventh report in that series.
--1999, Auto-ID Center founded at MIT
--2003, EPC Global founded at MIT
--2005, important technologies of the Internet of Things were proposed at the WSIS conference.
--2008, first international conference on the Internet of Things: IOT 2008 was held in Zurich.

Cisco’s Forecast about IoT

--In 2008, the number of things connected to the Internet was greater than the number of people living on Earth.
--By 2020, the number of things connected to the Internet will be about 50 billion.

Evolution of Internet of Things

Evolution of Internet of Things  report
Gartner Report

Why Internet of Things?
--Dynamic control of industry and daily life
--Improve the resource utilization ratio 
--Better relationship between human and nature
--Forming an intellectual entity by integrating human society and physical systems
--Flexible configuration, P&P…
--Universal transport & internetworking
--Accessibility & usability
--Acts as a technology integrator

Visions of Internet of Things
IoT General Requirements

IoT Communication Features

Technologies Involved

--Communication
--Backbone
--Hardware
--Protocols
--Software
--Data Brokers/Cloud
--Platforms
--Machine Learning

Communication
RFID
-A radio-frequency identification (RFID) system uses tags, or labels, attached to the objects to be identified. Two-way radio transmitter-receivers called interrogators or readers send a signal to the tag and read its response.
-RFID tags can be passive, active, or battery-assisted passive.
-Frequency: 120–150 kHz (LF), 13.56 MHz (HF), 433 MHz (UHF)
-Range: 10 cm to 200 m

EnOcean
-ISO/IEC 14543-3-10 (Alliance)
-EnOcean is an energy-harvesting wireless technology used primarily in building automation systems, but it is also applied in industry, transportation, logistics, and smart homes.
-Frequency: 315 MHz, 868 MHz, 902 MHz
-Range: 300m Outdoor, 30m Indoors

NFC
-ISO/IEC 18092 and ISO/IEC 14443-2,3,4
-NFC is a set of short-range wireless technologies, typically requiring a distance of 10 cm or less.
-NFC always involves an initiator and a target; the initiator actively generates an RF field that can power a passive target.
-Frequency: 13.56 MHz
-Range: < 0.2 m
Bluetooth
-Bluetooth is a wireless technology standard for exchanging data over short distances, using short-wavelength radio transmissions in the ISM band.
-Frequency: 2.4GHz
-Range: 1-100m
WiFi (Alliance)
-The Wi-Fi Alliance defines Wi-Fi as any "wireless local area network (WLAN) products that are based on the Institute of Electrical and Electronics Engineers' (IEEE) 802.11 standards."
-Frequency: 2.4 GHz, 3.6 GHz and 4.9/5.0 GHz
-Range: Common range is up to 100m but can be extended.
Weightless (SIG)
-Weightless is a proposed proprietary open wireless technology standard for exchanging data between a base station and thousands of machines around it, using white space spectrum with high levels of security.
-Frequency: varies with legislation (470–790 MHz)
-Range: up to 10 km
-Data Rates: 1 kbit/s to 10 Mbit/s

GSM (Association)
-GSM (Global System for Mobile communications) is an open, digital cellular technology used for transmitting mobile voice and data services.
-Frequency: Europe: 900 MHz & 1.8 GHz, US: 1.9 GHz & 850 MHz
-Data Rates: 9.6 kbps

Additional: 
3G  
4G LTE 
Dash7 
Ethernet 
GPRS 
PLC / Powerline 
QR Codes
EPC 
WiMax 
X-10 
802.15.4 
Z-Wave 
Zigbee 

Backbone
IPv6
--Internet Protocol version 6 (IPv6) is the latest revision of the Internet Protocol (IP), the communications protocol that provides an identification and location system for computers on networks and routes traffic across the Internet.
--IPv6 uses a 128-bit address, allowing 2^128, or approximately 3.4×10^38 addresses, or more than 7.9×10^28 times as many as IPv4, which uses 32-bit addresses.
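The address-space figures above are easy to check with a few lines of Python; the variable names are just for this example.

```python
ipv4_addresses = 2 ** 32   # 32-bit addresses
ipv6_addresses = 2 ** 128  # 128-bit addresses

print(f"IPv6 addresses: {ipv6_addresses:.1e}")                     # 3.4e+38
print(f"IPv6 / IPv4:    {ipv6_addresses // ipv4_addresses:.1e}")   # 7.9e+28
```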
UDP and TCP
--With UDP, computer applications can send messages, in this case referred to as datagrams, to other hosts on an Internet Protocol (IP) network without prior communications to set up special transmission channels or data paths.
--The Transmission Control Protocol (TCP) is intended for use as a highly reliable host-to-host protocol between hosts in packet-switched computer communication networks.
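To show what "without prior communications" means in practice, here is a minimal Python sketch that sends one UDP datagram over the loopback interface. The payload and the use of an OS-chosen port are assumptions for the example; a TCP version would first need a listen/accept/connect handshake before any data could flow.

```python
import socket

# Receiver: bind to an OS-chosen free port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
address = receiver.getsockname()

# Sender: no connection setup, just send a datagram to the receiver's address.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"temperature=21.5", address)

# Read one datagram of up to 1024 bytes, along with the sender's address.
payload, source = receiver.recvfrom(1024)
print(payload)

sender.close()
receiver.close()
```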
6LoWPAN
--6LoWPAN is an acronym of IPv6 over Low power Wireless Personal Area Networks. 
--The 6LoWPAN group has defined encapsulation and header compression mechanisms that allow IPv6 packets to be sent and received over IEEE 802.15.4 based networks.
--It has to contend with issues such as small packet sizes, low bandwidth, low power, large numbers of devices, unreliability from radio connectivity issues, battery drain, device lockups, and physical tampering.

Hardware
-Wireless SoC (system on chip)
-Self-contained, RF-certified module solutions that have TCP, UDP and IP on chip.
-Manufacturers: Gainspan, Wiznet, Nordic Semiconductor, TI
-Prototyping boards and platforms
--Arduino
--Raspberry Pi
--BeagleBone Black
-These communities and prototyping platforms make it possible to create your own Internet of Things project.
Sensors
-Sensors are used to obtain measurements of physical parameters such as the presence of certain biological entities (biosensors), wavelengths of light (image sensors), and flow velocity (thermal flow sensors) etc.

Software
Riot OS
-RIOT OS is an operating system for Internet of Things (IoT) devices. It is based on a microkernel and designed for energy efficiency, hardware-independent development, and a high degree of modularity.

Thingsquare Mist
-Thingsquare Mist is open source firmware that is exceptionally lightweight, battle-proven, and works with multiple microcontrollers and a range of radios.


Protocols
CoAP
-Constrained Application Protocol (CoAP) is an application layer protocol that is intended for use in resource-constrained internet devices, such as WSN nodes.

MQTT
-Message Queue Telemetry Transport (MQTT) is an open message protocol for M2M communications that enables the transfer of telemetry-style data in the form of messages from pervasive devices, along high latency or constrained networks, to a server or small message broker.
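The broker-based message flow described above can be pictured with a toy in-memory sketch in Python. This is not MQTT and does not use any MQTT library; the class name, the topic strings, and the exact-match routing are all assumptions meant only to show devices publishing telemetry to a broker that forwards it to subscribers.

```python
from collections import defaultdict


class TinyBroker:
    """In-memory stand-in for a message broker: routes messages by exact topic name."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every callback registered for this topic.
        for callback in self.subscribers[topic]:
            callback(topic, message)


received = []
broker = TinyBroker()
broker.subscribe("sensors/room1/temperature", lambda t, m: received.append((t, m)))

# Two devices publish telemetry; only the subscribed topic is delivered.
broker.publish("sensors/room1/temperature", "21.5")
broker.publish("sensors/room2/temperature", "19.0")
print(received)  # [('sensors/room1/temperature', '21.5')]
```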

XMPP
-The Extensible Messaging and Presence Protocol (XMPP) is an open technology for real-time communication.
-It powers a wide range of applications including instant messaging, presence, multi-party chat, voice and video calls, collaboration, lightweight middleware, content syndication, and generalized routing of XML data.

RESTful HTTP
-Representational State Transfer (REST) is a style of software architecture for distributed systems such as the World Wide Web. REST has emerged as a predominant web API design model.

Data Brokers/Cloud Services
ThingWorx
-It provides a complete application design, runtime, and intelligence environment - allowing organizations to rapidly create M2M applications
EVRYTHNG
-The EVRYTHNG Engine provides high scale, industrial technology to create and serve millions of Active Digital Identities™ for a company’s products and other objects. These unique online profiles create a persistent, unique digital presence for any physical object on the Web.   
Sense
-Open.Sen.se is an open platform for all those who want to imagine, prototype and test new devices, installations, scenarios, and applications for this globally interconnected and immersive world.
Grok Engine
-Grok is software with three capabilities: a high level of automation in analyzing streaming data, the ability to learn continuously from data, and the ability to drive action from the output of Grok's data models.

Characteristics of Most Relevant Standardization Activities

Middleware Architecture of IoT


SOA based architecture for IoT middleware

Technology Roadmap of Internet of Things

Applications of IoT
Management
Retail
Food
Education
Pharmaceuticals
Security
Transport and Logistics
Smart Cities
Smart Manufacturing
Daily life and domotics
Management
-Data Management
-Waste Management
-Urban Planning
-Production Management

Retail
-Intelligent Shopping
-Bar Code in Retail
-Electronic Tags

Pharmaceuticals
-Intelligent tags for drugs
-Drug usage tracking
-Enable emergency treatment to be given faster and more accurately

FOOD
-Control geographical origin
-Food production management
-Nutrition calculations
-Prevent overproduction and shortage
-Control food quality, health and safety.
EDUCATION
-School Administration
-Attendance Management
-Voting System
-Automatic Feedback 
-Instructional Technology
-Media 
-Information management
-Foreign language learning

AUTOBOT
-Diagnostics service for cars
-Alerts relatives in case of an accident
-Discovery service of car position
-Integrated with several web services
Transportation
-ConLock
-ContainerSafe
-Integration of light sensors, GPS, and GSM
Smart Cities
-Residential E-meters
-Smart street lights
-Pipeline leak detection
-Traffic control
-Surveillance cameras
-Centralized and integrated system control
Smart Manufacturing
-Flow optimization
-Real time inventory 
-Asset tracking
-Employee safety
-Predictive maintenance
-Firmware updates

Daily Life and Domotics
Source:IOT