Sunday, January 26, 2020

Comprehensive Study on Big Data Technologies and Challenges

Abstract: Big Data is at the heart of modern science and business. It has recently emerged as a new paradigm for hosting and delivering services over the Internet, and it offers huge opportunities to the IT industry. Big Data has become a valuable source and mechanism for researchers to explore the value of data sets in all kinds of business scenarios and scientific investigations. New computing platforms such as the Mobile Internet, social networks and Cloud Computing are driving the innovations of Big Data. The aim of this paper is to provide an overview of the Big Data concept and to address various Big Data technologies, the challenges ahead and possible solutions. It also explores certain services of Big Data over the traditional IT service environment, including data collection, management, integration and communication.

Keywords— Big Data, Cloud Computing, Distributed System, Volume

I. INTRODUCTION

Big Data has recently reached popularity and developed into a major trend in IT. Big Data is formed on a daily basis from Earth observations, social networks, model simulations, scientific research, application analyses, and many other sources. Big Data is a data analysis methodology enabled by a new generation of technologies and architectures which support high-velocity data capture, storage, and analysis. Data sources extend beyond the traditional corporate database to include email, mobile device output, sensor-generated data, and social media output. Data are no longer restricted to structured database records but include unstructured data. Big Data requires huge amounts of storage space, and a typical Big Data storage and analysis infrastructure will be based on clustered network-attached storage. This paper first defines the Big Data concept and describes its services and main characteristics.
â€Å"Big Data† is a term encompassing the use of techniques to capture, process, analyze and visualize potentially large datasets in a reasonable timeframe not accessible to standard IT technologies. II. Background Need of Big Data Big Data refers to large datasets that are challenging to store, search, share, visualize, and analyze the data. In Internet the volume of data we deal with has grown to terabytes and peta bytes. As the volume of data keeps growing, the types of data generated by applications become richer than before. As a result, traditional relational databases are challenged to capture, share, analyze, and visualize data. Many IT companies attempt to manage big data challenges using a NoSQL database, such as Cassandra or HBase, and may employ a distributed computing system such as Hadoop. NoSQL databases are typically key-value stores that are non-relational, distributed, horizontally scalable, and schema-free. We need a new methodology to manage big data for maximum business value. Data storage scalability was one of the major technical issues data owners were facing. Nevertheless, a new brand of efficient and scalable technology has been incorporated and data management and storage is no longer the problem it used to be. In addition, data is constantly being generated, not only by use of internet, but also by companies generating big amounts of information coming from sensors, computers and automated processes. This phenomenon has recently accelerated further thanks to the increase of connected devices and the worldwide success of the social platforms. Significant Internet players like Google, Amazon, Face Book and Twitter were the first facing these increasing data volumes and designed ad-hoc solutions to be able to cope with the situation. Those solutions have since, partly migrated into the open source software communities and have been made publicly available. 
This was the starting point of the current Big Data trend, as it offered a relatively cheap solution for businesses confronted with similar problems.

Dimensions of Big Data

Fig. 1 shows the four dimensions of Big Data. They are discussed below.

Fig. 1 Dimensions of Big Data

Volume refers to the fact that Big Data involves analyzing huge amounts of information, typically starting at tens of terabytes. It ranges from terabytes to petabytes and up. The NoSQL database approach is a response to storing and querying huge volumes of heavily distributed data. Velocity refers to the rate at which data is collected, acquired, generated or processed. Real-time data processing platforms are now considered by global companies as a requirement for gaining a competitive edge. For example, the data associated with a particular hashtag on Twitter often has a high velocity. Variety describes the fact that Big Data can come from many different sources, in various formats and structures. For example, social media sites and networks of sensors generate a stream of ever-changing data. As well as text, this might include geographical information, images, videos and audio. Veracity includes known data quality, type of data, and data management maturity, so that we can understand how right and accurate the data is.

Big Data Model

The big data model is an abstract layer used to manage the data stored in physical devices. Today we have large volumes of data in different formats stored in global devices. The big data model provides a visual way to manage data resources, and it creates a fundamental data architecture so that we can have more applications that optimize data reuse and reduce computing costs.

Types of Data

Data is typically categorized into three different types: structured, unstructured and semi-structured. Structured data is well organized, there are several choices for abstract data types, and references such as relations, links and pointers are identifiable.
Unstructured data may be incomplete and/or heterogeneous, and often originates from multiple sources. It is not organized in an identifiable way, and typically includes bitmap images or objects, text and other data types that are not part of a database. Semi-structured data is organized, containing tags or other markers to separate semantic elements, but it does not conform to the fixed structure of a relational database.

III. Big Data Services

Big Data provides an enormous number of services. This paper explains some of the important ones. They are given below.

Data Management and Integration

An enormous volume of data in different formats, constantly being collected from sensors, is efficiently accumulated and managed through the use of technology that automatically categorizes the data for archive storage.

Communication and Control

This comprises three functions for exchanging data with various types of equipment over networks: communications control, equipment control and gateway management.

Data Collection and Detection

By applying rules to the data streaming in from sensors, it is possible to analyze the current status. Based on the results, decisions can be made, with navigation or other required procedures performed in real time.

Data Analysis

The huge volume of accumulated data is quickly analyzed using a parallel distributed processing engine to create value through the analysis of past data or through future projections or simulations.

IV. BIG DATA TECHNOLOGIES

Internet companies such as Google, Yahoo and Facebook have been pioneers in the use of Big Data technologies and routinely store hundreds of terabytes and even petabytes of data on their systems. There is a growing number of technologies used to aggregate, manipulate, manage, and analyze Big Data. This paper describes some of the more prominent technologies, but the list is not exhaustive, especially as more technologies continue to be developed to support Big Data techniques. They are listed below.
Big Table: A proprietary distributed database system built on the Google File System. It was an inspiration for HBase.

Business intelligence (BI): A type of application software designed to report, analyze, and present data. BI tools are often used to read data that have been previously stored in a data warehouse or data mart. They can also be used to create standard reports that are generated on a periodic basis, or to display information on real-time management dashboards, i.e., integrated displays of metrics that measure the performance of a system.

Cassandra: An open source database management system designed to handle huge amounts of data on a distributed system. It was originally developed at Facebook and is now managed as a project of the Apache Software Foundation.

Cloud computing: A computing paradigm in which highly scalable computing resources, often configured as a distributed system, are provided as a service through a network.

Data Mart: A subset of a data warehouse, used to provide data to users, usually through business intelligence tools.

Data Warehouse: A specialized database optimized for reporting, often used for storing large amounts of structured data. Data is uploaded using ETL (extract, transform, and load) tools from operational data stores, and reports are often generated using business intelligence tools.

Distributed system: A distributed file system or network file system allows client nodes to access files through a computer network, so that a number of users working on multiple machines can share files and storage resources. The client nodes cannot access the block storage directly but interact through a network protocol. This enables restricted access to the file system depending on the access lists or capabilities of both servers and clients, which in turn depends on the protocol.

Dynamo: A proprietary distributed data storage system developed by Amazon.
Google File System: A proprietary distributed file system developed by Google; part of the inspiration for Hadoop.

Hadoop: Apache Hadoop is used to handle Big Data and stream computing. Its development was inspired by Google's MapReduce and Google File System. It was originally developed at Yahoo and is now managed as a project of the Apache Software Foundation. Apache Hadoop is open source software that enables the distributed processing of large data sets across clusters of commodity servers. It can be scaled up from a single server to thousands of machines, with a very high degree of fault tolerance.

HBase: An open source, free, distributed, non-relational database modeled on Google's Big Table. It was originally developed by Powerset and is now managed as a project of the Apache Software Foundation as part of Hadoop.

MapReduce: A software framework introduced by Google for processing huge datasets for certain kinds of problems on a distributed system; it is also implemented in Hadoop.

Mashup: An application that uses and combines data, presentation or functionality from two or more sources to create new services. These applications are often made available on the Web, and frequently use data accessed through open application programming interfaces or from open data sources.

Data-intensive computing: A type of parallel computing application which uses a data-parallel approach to process Big Data. It works on the principle of co-locating the data and the programs used to perform computation. Parallel and distributed systems that work together as a single integrated computing resource are used to process and analyze Big Data.

V. BIG DATA USING CLOUD COMPUTING

The Big Data journey can lead to new markets, new opportunities and new ways of applying old ideas, products and technologies. Cloud Computing and Big Data share similar features such as distribution, parallelization, space-time considerations, and being geographically dispersed.
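The MapReduce model listed above can be made concrete with the canonical word-count example. The sketch below is a single-process illustration of the map, shuffle, and reduce phases, not the distributed Hadoop implementation; the input documents are invented for the example.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the document
    return [(word, 1) for word in document.lower().split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by their key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts collected for each word
    return {word: sum(counts) for word, counts in groups.items()}

documents = ["big data needs big storage", "big clusters process data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts["big"])  # "big" appears three times across the documents
```

In a real cluster, the map calls run in parallel on the nodes holding each input split, the shuffle moves intermediate pairs across the network, and the reducers run in parallel per key — which is what lets the same three-phase logic scale to petabytes.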
Utilizing these intrinsic features would help to provide Cloud Computing solutions for Big Data to process and obtain unique information. At the same time, Big Data creates grand challenges as well as opportunities to advance Cloud Computing. In the geospatial information science domain, many scientists have conducted active research to address urban, environmental, social, climate, population, and other problems related to Big Data using Cloud Computing.

VI. TECHNICAL CHALLENGES

Many of Big Data's technical challenges also apply to data in general. However, Big Data makes some of these more complex, as well as creating several fresh issues. They are given below.

Data Integration

Organizations might need to decide if textual data is to be handled in its native language or translated. Translation introduces considerable complexity, for example the need to handle multiple character sets and alphabets. Further integration challenges arise when a business attempts to transfer external data to its system. Whether this is migrated as a batch or streamed, the infrastructure must be able to keep up with the speed or size of the incoming data, so the IT organization must be able to estimate capacity requirements effectively. Companies such as Twitter and Facebook regularly make changes to their application programming interfaces, which may not necessarily be published in advance. This can result in the need to make changes quickly to ensure the data can still be accessed.

Data Transformation

Another challenge is data transformation. Transformation rules will be more complex between different types of system records. Organizations also need to consider which data source is primary when records conflict, or whether to maintain multiple records. Handling duplicate records from disparate systems also requires a focus on data quality.

Historical Analysis

Historical analysis could be concerned with data from any point in the past.
That is not necessarily last week or last month; it could equally be data from 10 seconds ago. While IT professionals may be familiar with such an application, its meaning can sometimes be misinterpreted by non-technical personnel encountering it.

Search

Searching unstructured data might return a large number of irrelevant or unrelated results. Sometimes, users need to conduct more complicated searches containing multiple options and fields. IT organizations need to ensure their solution provides the right type and variety of search interfaces to meet the business's differing needs. And once the system starts to make inferences from data, there must also be a way to determine the value and accuracy of its choices.

Data Storage

As data volumes increase, storage systems are becoming ever more critical. Big Data requires reliable, fast-access storage. This will hasten the demise of older technologies such as magnetic tape, but it also has implications for the management of storage systems. Internal IT may increasingly need to take a similar, commodity-based approach to storage as third-party cloud storage suppliers do today: removing rather than replacing individual failed components, until the entire infrastructure is due for refresh. There are also challenges around how to store the data, whether in a structured database or within an unstructured system, and how to integrate multiple data sources.

Data Integrity

For any analysis to be truly meaningful, it is important that the data being analyzed is as accurate, complete and up to date as possible. Erroneous data will produce misleading results and potentially incorrect insights. Since data is increasingly used to make business-critical decisions, consumers of data services need to have confidence in the integrity of the information those services are providing.

Data Replication

Generally, data is stored in multiple locations in case one copy becomes corrupted or unavailable. This is known as data replication.
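The replication idea can be sketched in a few lines. This is a toy in-memory illustration under assumed names — real systems replicate disk blocks or partitions across machines — but it shows the core property: each write goes to several distinct nodes, so a read still succeeds after node failures.

```python
import hashlib

# Toy replication sketch: every value is written to REPLICAS distinct nodes,
# so reads survive the loss of any single node. Node names are illustrative.
NODES = ["node-1", "node-2", "node-3", "node-4"]
REPLICAS = 3

storage = {node: {} for node in NODES}

def replica_nodes(key):
    # Choose REPLICAS consecutive nodes starting at the key's hash position
    start = int(hashlib.sha1(key.encode()).hexdigest(), 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]

def put(key, value):
    for node in replica_nodes(key):
        storage[node][key] = value

def get(key, failed=()):
    # Read from the first replica that is still alive
    for node in replica_nodes(key):
        if node not in failed:
            return storage[node][key]
    raise KeyError(key)

put("block-17", b"payload")
# With 3 replicas on 4 nodes, the data survives any two node failures
print(get("block-17", failed=("node-1", "node-2")))
```

The cost is obvious from the sketch: every write is multiplied by the replication factor, which is exactly the scalability concern raised in the next paragraph.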
The volumes involved in a Big Data solution raise questions about the scalability of such an approach. However, Big Data technologies may take alternative approaches. For example, Big Data frameworks such as Hadoop are inherently resilient, which may mean it is not necessary to introduce another layer of replication.

Data Migration

When moving data in and out of a Big Data system, or migrating from one platform to another, organizations should consider the impact that the size of the data may have. Given the variety of formats involved, the volumes of data will often mean that it is not possible to operate on the data during a migration.

Visualisation

While it is important to present data in a visually meaningful form, organizations need to consider the most appropriate way to display the results of Big Data analytics so that the data does not mislead. IT should take into account the impact of visualisations on the various target devices, on network bandwidth and on data storage systems.

Data Access

The final technical challenge relates to controlling who can access the data, what they can access, and when. Data security and access control are vital to ensure data is protected. Access controls should be fine-grained, allowing organizations not only to limit access to data, but also to limit knowledge of its existence. Enterprises therefore need to pay attention to the classification of data. This should be designed to ensure that data is not locked away unnecessarily, but equally that it doesn't present a security or privacy risk to any individual or company.

VII. CONCLUSION

This paper reviewed the technical challenges, various technologies and services of Big Data. Big Data describes a new generation of technologies and architectures designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture.
Linked Data databases will become more popular and could potentially push traditional relational databases to one side due to their increased speed and flexibility. This means businesses will be able to develop and evolve applications at a much faster rate. Data security will always be a concern, and in future data will be protected at a much more granular level than it is today. Currently, Big Data is seen predominantly as a business tool. Increasingly, though, consumers will also have access to powerful Big Data applications. In a sense, they already do, through Google and various social media search tools. But as the number of public data sources grows and processing power becomes ever faster and cheaper, increasingly easy-to-use tools will emerge that put the power of Big Data analysis into everyone's hands.

Friday, January 17, 2020

Digital Literacy Making Us Smarter

Technology has had significant effects on society, and it is slowly changing how people live nowadays. There is no doubt that it has made lives easier, at times simpler, but this does not mean that it always has positive outcomes. One example of how technology has brought about negative effects on society is how it is affecting literacy and how people appreciate reading in the traditional sense. Author Christine Rosen, in her work entitled "People of the Screen," indicates that technology has now allowed people to replace books with electronic readers and the Internet, so much so that traditional printed books might become a thing of the past. The thought of digital literacy replacing print literacy is alarming because it means depending too much on technology when the need to replace it is not that significant. While technology is definitely making people more capable, there is a question whether it does make them smarter. Screen reading is definitely different from traditional reading, even though some people may disagree. "By contrast, screen reading, a historically recent arrival, encourages a different kind of self-conception, one based on interaction and dependent on the feedback of others. It rewards participation and performance, not contemplation" (Rosen "People of the Screen"). Screen reading, thus, makes people smarter regarding technology and the different skills it needs to work. Screen reading requires people to look at monitors, push buttons, and scroll with a mouse. It requires people to know how to navigate the devices, programs, or software to participate. "Screen reading allows you to read in a 'strategic, targeted manner,' searching for particular pieces of information" (Rosen "People of the Screen").
However, there is a question whether this type of reading really stimulates their minds and instills in them what they have just read on the screen. Traditional reading is entirely different from screen reading because it allows the reader to imagine and let his or her mind work actively while reading. "You enter the author's world on his terms, and in so doing get away from yourself. Yes, you are powerless to change the narrative or the characters, but you become more open to the experiences of others and, importantly, open to the notion that you are not always in control" (Rosen "People of the Screen"). In addition, books enhance the readers' reading experience because they are tangible and allow the readers to turn the pages, feel their thinness or thickness, and see for themselves how far along they are from finishing. While books are bulky, there is a great feeling in seeing them stacked together, especially in libraries, and seeing first-hand how much a person has collected over years of reading. People should decide whether they want to replace print literacy with digital literacy. "Literacy, the most empowering achievement of our civilization, is to be replaced by a vague and ill-defined screen savvy. The paper book, the tool that built modernity, is to be phased out in favor of fractured, unfixed information. All in the name of progress" (Rosen "People of the Screen"). Digital literacy is important because of the significant role that technology plays in people's lives today, but this does not mean that it is better than the traditional way. While it makes people adapt to the changing times, it certainly does not make them smarter or more literate.

Thursday, January 9, 2020

Aging, Gender, Education And Modernization

Life expectancy across the globe has increased during the last century, resulting in a rise in the proportion of the older population, especially in developed countries. This brings with it many challenges concerning the welfare of the elderly, as they need food, shelter, health care and security (Sung, 2004). It becomes important to study how different cultures perceive old age, as these attitudes determine behaviors towards the old (Yun and Lachman, 2006). The main objective of this literature review is to find out what the perceptions about ageing are in different cultures and how they influence the treatment meted out to older people in a society. Moreover, this review also aims to describe how these perceptions towards old … There are two aspects of old age that researchers have generally addressed: first, perceptions about getting old; second, respect for elders, or how the elderly are treated in a certain culture. Studies have identified respect as a major influence on the quality of life in old age (Sung, 2004; Yun and Lachman, 2006), both within the family and society at large (Sung, 2004). In East Asian cultures, aging is associated with higher respect and honor in the family and the community (Sung, 2004; Yun and Lachman, 2006), which derives from Confucian principles of filial piety (Sung, 2004; Yun and Lachman, 2006; Sung and Dunkle, 2009). However, longer life expectancy has affected these traditional cultures, and as elders become more frail and ill they are neglected and mistreated (Sung and Dunkle, 2009). Moreover, social changes in East Asia have unfavourably affected the older generation (Yun and Lachman, 2006); these include urbanization, more women working outside their homes, differences in the education level of the old and the young generations, and the loss of the joint family system (Yun and Lachman, 2006).
Researchers (Sung, 2004; Sung and Dunkle, 2006) also argue that it is important to find out young adults' attitudes towards the elderly in a society, as they are the ones forming the support system for the old generation. Sung (2004) explored different forms of respect for elders shown by young adults

Wednesday, January 1, 2020

The Color Of Our Skin Daren

The color of our skin daren't portray the lives that we live on a daily basis; that is what's wrong with today's society. Prior to 9/11, law enforcement officials had been using racial profiling on a daily basis in their efforts to combat crime. With the attacks on 9/11, an enemy that had previously been invisible became very much a reality. A reality that needed to be dealt with immediately, using the only tools that were available at the moment. Just because racial profiling was semi-effective doesn't make it right. It's what makes it wrong. What is terrorism? Terrorism is the use of threat or violence, especially as a means of forcing others to do what one wishes. Terrorism is real and it comes in many forms; even when you may not know it is terrorism, it happens constantly all around the world every day. There are many groups that practice terrorism, a few in particular that our government has to deal with on a daily basis to make sure the people in our country stay safe. For example, there are Muslim, Arab, and Israeli groups that are highly trained and will use deadly force no matter the consequences. The tactics they use could be a lot of things, for example cyber-war, piracy and suicide attacks. They have people literally willing to commit suicide with a bomb strapped to their chest, or a box hidden somewhere, just to "help their people out"; they think it's a good thing to do even if they don't succeed in what "the mission" really was. And terrorism isn't …
And with the politicians or the military, they don't see it as terrorism; they see it as getting the job done no matter what. They'll target certain racial or ethnic groups they think are "sketchy," have them under surveillance to see what they are up to, and take every precaution to make sure nothing comes overseas to do any harm or damage to our country. And it is hard to target certain minority groups; sometimes they know exactly what's going on, so they will set up other groups to make it look like it is their doing, when in all reality it is that group causing all the trouble. There are several ways our government has used race or ethnicity to detain individuals. In the interest of national security after 9/11, the Bush Administration detained over 700 foreign nationals, not because of individualized suspicions, but because they were Arab and/or Muslim. Over 5,000 persons who had entered the United States legally in the 2 years prior to November 2001 had come from countries that had been linked to terrorism.
Interviews were conducted with these 5,000 individuals, who were targeted because of their race and religious beliefs. The age range predetermined for these 5,000 individuals was 18 to 33 years of age; the range was set in November 2001 by order of Attorney General Ashcroft, completely based on ethnicity. In 2003, guidelines were set for the use of race in criminal