About thin provisioning

Also known as virtual provisioning, thin provisioning is the efficient management of storage space on a SAN. Storage is allocated on an as-needed basis, and many midrange storage systems on the market support this feature. Space is allocated only when data is written, which requires a system that can serve up storage from a reserved, pre-defined storage pool: capacity on demand. One might expect this to add administrative overhead, but these systems handle it automatically, on the fly.

A product that supports this feature, and one I use in my shop, is the EMC Celerra NAS appliance. The Celerra NS-G is a NAS gateway that uses storage targets located on a CLARiiON CX-3 or CX-4 SAN. Carve out 2 TB on the CLARiiON and allocate it to the NAS, and that 2 TB can then be divided into various file systems. One can then enable dynamic allocation on each of these file systems, if desired, and assign a certain amount of storage for, say, user home directories. When the home directories reach 90% utilization of the assigned file system, the NAS dynamically allocates more storage from the 2 TB pool, usually in 5% increments, up to a pre-defined limit set by the administrator.
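The threshold-and-increment behavior described above can be sketched as a small simulation. This is only an illustration and not how the Celerra implements it: the class names, the 100 GB starting size, and the 200 GB ceiling are hypothetical, and I am assuming that "5% increments" means 5% of the file system's current size.

```python
from dataclasses import dataclass


@dataclass
class StoragePool:
    """Reserved capacity that file systems draw from on demand."""
    capacity_gb: float
    allocated_gb: float = 0.0

    def grant(self, requested_gb: float) -> float:
        """Hand out up to requested_gb, limited by what is left in the pool."""
        granted = max(0.0, min(requested_gb, self.capacity_gb - self.allocated_gb))
        self.allocated_gb += granted
        return granted


@dataclass
class ThinFileSystem:
    pool: StoragePool
    size_gb: float                 # space currently assigned from the pool
    max_size_gb: float             # administrator-defined ceiling
    used_gb: float = 0.0
    high_water_mark: float = 0.90  # extend once utilization reaches 90%
    increment: float = 0.05        # assumption: 5% of the current size

    def extend(self) -> bool:
        """Grow by one increment; return False if the limit or pool blocks it."""
        step = min(self.size_gb * self.increment, self.max_size_gb - self.size_gb)
        granted = self.pool.grant(step)
        self.size_gb += granted
        return granted > 0

    def write(self, gb: float) -> None:
        """Store gb of data, extending first while the mark would be reached."""
        target = self.used_gb + gb
        while target >= self.high_water_mark * self.size_gb:
            if not self.extend():
                break
        if target > self.size_gb:
            raise OSError("file system full: limit reached or pool exhausted")
        self.used_gb = target


pool = StoragePool(capacity_gb=2048)  # the 2 TB carved out for the NAS
home = ThinFileSystem(pool, size_gb=pool.grant(100), max_size_gb=200)
home.write(85)  # 85% utilization: below the mark, nothing happens
home.write(6)   # would reach 91%: the file system grows before the write lands
print(home.size_gb, pool.allocated_gb)
```

The point of the sketch is that the pool only gives up capacity when a write actually pushes a file system past its high-water mark, so unused space stays in the pool for whichever file system needs it next.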

A green solution, thin provisioning can reduce the number of powered-on, spinning disks that sit idle.

The Historical Development of Storage Networks

In the 1960s, computing was in the hands of government and scientific organizations, with a few large business enterprises using rudimentary data processing technology. In these early days of organizational computing, storage was kept centrally, alongside the mainframe computers that used it. This was a very secure way to store data, and administering that data was more streamlined than it would become with departmentalization, as we will see shortly. In the mainframe era, an application would consume all resources while it ran, and the system would sit idle when no processes were assigned. This created the need for timesharing, in which idle system time was spent on other tasks: a tiered and more efficient approach to data processing that improved ROI and production numbers.

Any organization is segmented into numerous departments, such as finance, research and development, technology, and marketing, and in the ’70s we saw the departmentalization of data, in which each department would store its own data. This came about with the installation of microcomputers within the various departments in place of the traditional terminal, which accessed the back-end mainframe directly. While terminals were (and still are) in use, the microcomputer had started to create a segregated storage architecture within the organization.

Later, file sharing began to consolidate data for departments, and that data resided on small or midrange departmental servers. This was the first step toward storage consolidation under a storage network. Server farms developed from this approach, but data was still departmentalized and often still resided on separate servers, which required more administration. With advances in data networking, applications were now being developed that used data from numerous locations. Software was also evolving to support the wide area network, which could transfer data across vast distances, spreading data storage over a wider footprint and making administration of that data more complex.

Client-server computing, already in use at this time within the aforementioned departmentalized computing, was another significant step towards the present storage network; however, department data still resided on separate hosting servers, each with its own DAS (Direct Attached Storage). This arrangement meant separate backup tape drives for each department, each needing its own administration, which increased administrative overhead.

With the proliferation of globalized data, the need for a centralized data store was once again being realized. This would ease the administration of data storage and backups, and especially streamline disaster recovery planning and execution. The SAN was the answer to an organization’s need for consolidated data storage and streamlined administration. Failure-tolerant external storage subsystems made it possible to share access to storage devices (Barker & Massiglia, 2002). With a SAN, administrators could back up and restore data more quickly because the architecture was less complex and less segregated, and applications could access databases from a centralized repository.

Reference:
Barker, R., Massiglia, P. (2002). External Storage Subsystems: Software. Storage Area Network Essentials. John Wiley & Sons

SSD

Flash-based (solid state) storage: while the price is still a bit high for large storage pools, it makes complete sense for some applications (On-Storage, 2009). Also known as SSDs (solid state drives), flash disks are touted as the fastest storage medium yet, a great improvement for RAID arrays and a performance boost for databases. According to EMC, SSDs can store and query data faster than magnetic disk drives, including modern Fibre Channel and SATA II drives, and are also more energy efficient. SSDs can store a terabyte of data using 38 percent less power than conventional FC disk drives. “Since it would take thirty 15,000 RPM FC disk drives to deliver the same performance as a single flash drive, this translates into a dramatic 98 percent reduction in power consumption to achieve similar transaction-per-second performance” (EMC, 2008). Many have been waiting a long time for the rise of this exciting solid state technology. Eventually, this could mean the beginning of the end for magnetic media within the enterprise data storage environment.

Flash memory, also called flash RAM, is nonvolatile memory that is erased and reprogrammed in blocks as applications need. It is a variant of electrically erasable programmable read-only memory (EEPROM). EEPROM is erased and rewritten at the slower byte level, whereas flash works at the more efficient block level. There are two types of SSD: RAM-based and flash-based. It is flash that is changing today’s enterprise storage. Although flash is not as fast as RAM-based SSD, it is still the choice over magnetic media for fast data arrays within the data center.

Performance is superior to that of the legacy magnetic disk, and this advance has given rise to a new tier of storage: Tier 0. According to SearchStorage.com, “Tier 1 storage, also known as production storage, can be considered the first class cabin for production data. Tier 2 and lower storage tiers were developed to handle data that is not quite as critical or does not need the performance characteristics of Tier 1 storage” (SearchStorage.com, 2009). With SSD, a new tier is defined for performance beyond what Tier 1 can offer. Previously, Tier 0 meant RAM disk, an expensive endeavor that demanded excessive amounts of RAM within systems. Now that the cost of SSD has dropped (and continues to drop), it is more accessible, and more companies are adopting SSD Tier 0 solutions to enhance performance within critical information systems applications.

The benefits of SSD are most fully realized within database stores, as these arrays demand a high level of I/O. The significantly reduced seek time of SSD can produce a high ROI over time. Another advantage is that SSDs include ECC (error-correcting code) technology, and because there are no moving heads to reposition, file fragmentation carries far less of a performance penalty, which can reduce administration labor and service cost. This author has seen service cases in which a system was running slowly due to severely fragmented file systems residing on magnetic SCSI disks.

It is a popular question in information systems: are SSDs green? Do they consume less power than the older spindle-based disk arrays? The answer is no, if one is looking at a power-per-TB comparison; but compared with an array that uses a high spindle count to get greater I/O, SSD can consume less energy. The more spindles (disk drives) an array has, the more energy it consumes (especially within non-virtualized arrays), whereas with SSD fewer drives are needed to reach the desired I/O performance. According to SearchStorage, “SSDs do not need extra spindles; they deliver high speed out of the box. The result is a lower number of devices and therefore a lowering of power consumption rates” (SearchStorage, 2009). As for performance, research has found that “A typical hard disk drive performs 4- to 5-msec reads or writes and approximately 150-300 random I/Os per second. A RAM-based SSD does .015 msec reads and writes and about 400,000 I/Os per second. A flash-based SSD does about 0.2 msec reads and 2-msec writes. I/O performance is 100,000 random I/O per second on reads and 25,000 I/Os per second on writes.” (SearchStorage, 2009).
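The spindle-count argument is simple arithmetic, and a short sketch makes it concrete. The I/O figures are the SearchStorage numbers quoted above; the per-drive wattages are illustrative placeholders that I made up for the example, not measured or vendor-published values, so only the shape of the comparison should be taken from it.

```python
import math


def drives_needed(target_iops: float, iops_per_drive: float) -> int:
    """Number of drives required to reach a random-I/O target."""
    return math.ceil(target_iops / iops_per_drive)


def array_watts(drive_count: int, watts_per_drive: float) -> float:
    """Total draw of the drives alone (ignores shelves, controllers, cooling)."""
    return drive_count * watts_per_drive


# Random I/Os per second, from the SearchStorage figures quoted above.
HDD_IOPS_LOW, HDD_IOPS_HIGH = 150, 300
FLASH_READ_IOPS = 100_000

# Placeholder wattages for illustration only; substitute real data-sheet values.
HDD_WATTS = 15.0
FLASH_WATTS = 10.0

target = 100_000  # random reads per second the application needs

best_case_hdds = drives_needed(target, HDD_IOPS_HIGH)
worst_case_hdds = drives_needed(target, HDD_IOPS_LOW)
flash_drives = drives_needed(target, FLASH_READ_IOPS)

print(f"HDDs needed: {best_case_hdds} to {worst_case_hdds}")
print(f"Flash SSDs needed: {flash_drives}")
print(f"HDD array draw (best case): {array_watts(best_case_hdds, HDD_WATTS):.0f} W")
print(f"Flash draw: {array_watts(flash_drives, FLASH_WATTS):.0f} W")
```

Whatever wattages are plugged in, the drive count dominates: hundreds of spindles are needed to match the read rate of a single flash device, which is why the I/O-driven comparison favors SSD even though the per-TB comparison does not.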

SSD storage technology is bringing faster I/O and greater energy savings to enterprises. While it has not yet replaced the magnetic spindle disk, it will do so in the not-so-distant future. This will benefit storage area networks of all types, from the small SAN to the global SAN, by providing superior response times within database arrays and critical applications.

References:

Pund-IT. (2008). EMC Brings Flash Drives, Virtual Provisioning and 1TB SATA to Symmetrix. Retrieved April 4, 2010 from http://www.emc.com/collateral/analyst-reports/pund-it-review-symm-flash.pdf

Koopman, J. (2008). Top 5 Storage Trends. OnStorage. Retrieved May 20, 2009 from http://www.on-storage.com/50226711/top_5_storage_trends.php

SearchStorage.com. (2009). Solid State Storage. Retrieved May 25, 2009 from http://searchstorage.target.com

King, C., IT, Pund., Hill, D. (2009). EMC Brings Flash Drives, Virtual Provisioning, and 1TB SATA to Symmetrix. Breaking News Review. Volume 4, Issue 4. EMC. Retrieved May 18, 2009 from http://www.emc.com/collateral/analyst-reports/pund-it-review-symm-flash.pdf

Dynamic allocation of storage within a WAN

Emerging technology such as dynamic allocation includes the ability to expand storage pools automatically on demand: not just locally, but across the wide-area SAN. Administrator-defined file systems within storage systems connected over WAN links can work as one dynamic storage pool. To quote a DQ post from my last class: EMC is now offering this on their V-Max Symmetrix DMX systems. Dynamic storage allocation automatically scales storage on demand with reduced manual administration and “consolidates storage arrays into a single… system” (Symmetrix, 2009). Additionally, systems should dynamically detect connected data paths, such as the connections between HBAs and storage pools; this would save administrators the time spent manually defining worldwide names within zoning configurations and matching that information with the back-end pools. A dynamic system would automatically detect such paths and present them to the administrator’s storage management application.

Reference:
Symmetrix V-Max. (2009). Consolidation of more workloads with the world’s most powerful networked storage. EMC. Retrieved June 25, 2009 from http://www.emc.com/products/detail/hardware/symmetrix-v-max.htm

Creating and Connecting to an iSCSI LUN on an EMC Celerra

From YouTube.

EMC on Using the iSCSI Wizard for Celerra

A short how-to on adding iSCSI storage with EMC Celerra.

http://www.youtube.com/watch?v=zPqWWxnTwdI

ISNS and management of storage nets

From:
PAMMIDIMUKKALA, PRASAD. “ISNS eases management of storage nets.” Network World (Sept 1, 2003): 29. Academic OneFile. Gale. BCR Regis University. 31 July 2009

Internet Storage Name Service brings the plug-and-play capabilities of Fibre Channel to IP storage networks. ISNS facilitates automated discovery, management and configuration of iSCSI and Fibre Channel devices on a TCP/IP network. In a Fibre Channel fabric, a simple name server provides these services.

In any storage network, servers (or initiators) need to know which storage resources (or targets) they can access. One way to accomplish this is for an administrator to configure each initiator manually with its own list of authorized targets and configure each target with a list of authorized initiators and access controls. But this process is time-consuming and error-prone, and accidentally configuring multiple servers to access the same storage resources could be disastrous.

An Internet storage name server lets servers automatically identify and connect to authorized storage resources. Letting the servers dynamically adapt to changing storage resource membership and availability without human intervention results in even more efficiency.

Whereas a Fibre Channel storage name server can handle only Fibre Channel devices, iSNS can accommodate iSCSI devices and Fibre Channel devices via the Internet Fibre Channel Protocol. End nodes (initiators and targets) in an iSNS environment run a lightweight iSNS client that represents the host device to the iSNS server.

ISNS provides the following services:

* Name registration and discovery services – Targets and initiators register their attributes and address, and then can obtain information about accessible storage devices dynamically.

* Discovery domains and logon control service – Resources in a typical storage network are divided into groupings called discovery domains, which can be administered through network management applications. Discovery domains enhance security by providing access control to targets that are not enabled with their own access controls, while limiting the logon process of each initiator to a relevant subset of the available targets in the network.

* State-change notification service – The iSNS server notifies relevant iSNS clients of network events that could affect the operational state of storage nodes. Events such as storage resources going offline, discovery domain membership changes and link failure in a network can trigger state-change notifications. These notifications let a network quickly adapt to changes in topology, which is key to scalability and availability.

* Open mapping of Fibre Channel and iSCSI devices – The iSNS database can store information about Fibre Channel and iSCSI devices and mappings between the two in a multi-protocol environment. The mapped information is then available to any authorized iSNS client. This centralized approach is open and scalable instead of retrieving the mappings from individual iSCSI-FC gateways using proprietary mechanisms.
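The first three services in the list above (registration and discovery, discovery domains, and state-change notification) can be illustrated with a toy in-memory model. This is a sketch of the concepts only: it does not speak the iSNS wire protocol, and the class, method names, node names, and addresses are all hypothetical.

```python
from collections import defaultdict


class ToyNameServer:
    """In-memory stand-in for an iSNS server (not the real wire protocol)."""

    def __init__(self):
        self._nodes = {}                  # node name -> (role, address)
        self._domains = defaultdict(set)  # discovery domain -> member names
        self._listeners = {}              # node name -> state-change callback

    def register(self, name, role, address, on_change=None):
        """Name registration: a client records its role and address."""
        self._nodes[name] = (role, address)
        if on_change is not None:
            self._listeners[name] = on_change

    def add_to_domain(self, domain, name):
        """Place a node in a discovery domain and tell its new peers."""
        self._domains[domain].add(name)
        self._notify(name, f"{name} joined {domain}")

    def deregister(self, name):
        """Take a node offline and tell the peers that could see it."""
        self._notify(name, f"{name} went offline")
        self._nodes.pop(name, None)
        self._listeners.pop(name, None)

    def discover_targets(self, initiator):
        """Return only registered targets sharing a domain with the initiator."""
        return {
            peer: self._nodes[peer][1]
            for peer in self._peers(initiator)
            if peer in self._nodes and self._nodes[peer][0] == "target"
        }

    def _peers(self, name):
        """All other nodes that share at least one discovery domain with name."""
        peers = set()
        for members in self._domains.values():
            if name in members:
                peers |= members
        peers.discard(name)
        return peers

    def _notify(self, changed, message):
        """State-change notification, limited to nodes sharing a domain."""
        for peer in sorted(self._peers(changed)):
            if peer in self._listeners:
                self._listeners[peer](message)


events = []
ns = ToyNameServer()
ns.register("host1", "initiator", "10.0.0.5", on_change=events.append)
ns.register("array1", "target", "10.0.0.20:3260")
ns.register("array2", "target", "10.0.0.21:3260")
ns.add_to_domain("dd-finance", "host1")
ns.add_to_domain("dd-finance", "array1")
ns.add_to_domain("dd-research", "array2")

print(ns.discover_targets("host1"))  # array2 is in another domain, so it is hidden
ns.deregister("array1")
print(events)
```

Notice that host1 never learns about array2: the discovery domain acts as the access boundary, so no per-initiator target list has to be maintained by hand.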

ISNS clients discover the iSNS server or servers using a variety of mechanisms, including Dynamic Host Configuration Protocol, Service Location Protocol and broadcast or multicast heartbeat messages. The iSNS framework allows for back-up iSNS servers that provide redundancy and failover.

ISNS servers also can store and distribute X.509 public-key certificates used for authenticating iSCSI storage nodes during the logon process.

By facilitating a seamless integration of IP and Fibre Channel networks, iSNS provides value to any storage network composed of iSCSI and/or Fibre Channel devices. The iSNS specification is on the standards track with the Internet Engineering Task Force IP Storage Working Group and is expected to be classified as a proposed standard soon.

Pammidimukkala is a director of product management for Nishan Systems and is the iFCP subgroup chair in the SNIA IP Storage Forum. He can be reached at prasad@nishansystems.com.
