Reg. No.: 15MIS0014
Name: K.C.Rohith

Question: By identifying the various data sources that constitute big data, develop a conceptual big data analytics architecture for insurance applications and systems.

Answer:

Big Data technologies are used comprehensively to assess risk, handle claims, and enhance customer experience, allowing insurance companies to achieve higher predictive accuracy. Let's take a look at the major uses of big data and its technologies in the insurance industry.

1. Risk Assessment
By using predictive modeling, insurers can identify whether drivers are likely to be involved in an accident, or to have their car stolen, by combining behavioral data with factors such as road conditions or the safety of the neighborhood (a minimal sketch of such a model follows this list). A similar use can be seen in health and life insurance due to the growing use of wearable technology: activity trackers can monitor users' behaviors and habits and provide ongoing assessments of their activity levels.

2. Fighting Against Fraud
Unfortunately, the number of counterfeit claims in the insurance industry has grown, and reducing fraudulent claims matters because it protects companies from losses. Insurance companies are using graph analytics and other new types of analytics to spot suspicious behavior in applicants who have a higher tendency to commit fraud (an illustration of the graph approach also follows this list).

3. Effective Marketing
Big Data analytics allows insurance companies to use digital media to reach a larger number of potential customers through effective marketing campaigns. Insurers are using data insights to identify selling opportunities and demand.

4. Personalized Policies
Insurance companies are increasing their customer base by making the right use of fine-grained analytics to offer personalized policies. Through predictive analytics, companies get more accurate information about customers and can then offer personalized services or policies to each of them.

5. Automation
Insurers initially automated simple processes such as compliance checks, data entry, and other repetitive tasks that require little initiative. With the rise of big data technologies, automation is extending from these simple tasks to more complex ones.

6. Smarter Labor and Finance
With the help of real-time analysis, insurers can now make daily adjustments to premium rates, premium strategies, and underwriting limits by combining internal data (policies, regulations) with external data (social media, press, analyst comments) in order to optimize their finances and enable instant payouts. Data mining techniques are also used to cluster and score claims so that they can be prioritized and assigned to the most appropriate employee based on claim complexity and the employee's experience (a sketch of this triage idea closes the set of examples below). This saves insurers a significant amount of labor time and protects them from unnecessarily high settlement amounts.
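To make the predictive-modeling idea in point 1 concrete, here is a minimal sketch in Python using scikit-learn. The features, the synthetic data, and the label are illustrative assumptions, not a real insurer's model:

# Minimal sketch of a driver-risk model: combines behavioral telematics
# features with environmental factors. All features and data here are
# illustrative assumptions, not a real insurer's schema.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.normal(60, 15, n),   # average speed (km/h)
    rng.poisson(2, n),       # hard-braking events per week
    rng.integers(0, 2, n),   # 1 = mostly night driving
    rng.integers(0, 2, n),   # 1 = high-theft neighborhood
])
# Synthetic label: 1 = claim filed within the policy year.
y = (0.02 * X[:, 0] + 0.5 * X[:, 1] + 0.8 * X[:, 2] + 0.6 * X[:, 3]
     + rng.normal(0, 1, n)) > 3.5

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Probability of a claim for a new driver profile.
new_driver = [[72, 4, 1, 0]]
print("claim probability:", model.predict_proba(new_driver)[0, 1])
print("test accuracy:", model.score(X_test, y_test))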
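For point 2, the graph-analytics approach can be sketched with the networkx library: claims are linked to the contact details they share, and a group of claims clustered around one phone number or address is a classic sign of an organized ring. The sample records and the flagging threshold are assumptions made for illustration:

# Sketch of graph-based fraud screening: build a graph linking claims to
# the contact details they share; many claims around one phone number or
# address can indicate an organized fraud ring.
import networkx as nx

claims = [
    ("claim-1", {"phone": "555-0101", "address": "12 Oak St"}),
    ("claim-2", {"phone": "555-0101", "address": "98 Elm Ave"}),
    ("claim-3", {"phone": "555-0101", "address": "12 Oak St"}),
    ("claim-4", {"phone": "555-0202", "address": "7 Pine Rd"}),
]

G = nx.Graph()
for claim_id, details in claims:
    for field, value in details.items():
        G.add_edge(claim_id, f"{field}:{value}")  # bipartite: claims <-> attributes

# Connected components containing many claims are flagged for review.
for component in nx.connected_components(G):
    linked_claims = [node for node in component if node.startswith("claim-")]
    if len(linked_claims) >= 3:  # illustrative threshold
        print("possible fraud ring:", sorted(linked_claims))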
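For point 6, the clustering and scoring of claims might look like the following sketch, which groups claims by rough complexity signals with k-means and routes the most complex group to senior adjusters. The features, cluster count, and routing rule are again illustrative assumptions:

# Sketch of claim triage: cluster claims by rough complexity signals with
# k-means, then route the most complex cluster to experienced adjusters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Each row: [claimed amount ($), parties involved, documents attached]
claims = np.array([
    [1_200,  1,  3],
    [900,    1,  2],
    [48_000, 4, 15],
    [52_000, 5, 18],
    [7_500,  2,  6],
    [6_800,  2,  5],
])

X = StandardScaler().fit_transform(claims)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Treat the cluster with the highest mean claimed amount as "complex".
complex_cluster = max(set(labels), key=lambda c: claims[labels == c, 0].mean())
for i, label in enumerate(labels):
    queue = "senior adjuster" if label == complex_cluster else "standard queue"
    print(f"claim {i}: cluster {label} -> {queue}")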
Architecture:

Along with the server clusters provisioned in company data centers or leased from virtual cloud environments, the software used to deploy and manage Big Data application environments is a crucial element. The complexity of deploying "Big Data infrastructure" clusters has been somewhat lessened by a new generation of open-source software frameworks. The leading solution is Apache Hadoop, which has gained great popularity due to its maturity, ease of scaling, affordability as a non-proprietary data platform, ability to handle both structured and unstructured data, and many connector products.

Combining the leading Apache Hadoop software stack with the leading cluster management solution, StackIQ has engineered a new solution that makes Hadoop deployments of all sizes faster, less costly, more reliable, and more flexible. StackIQ Enterprise Data optimizes and automates the deployment and management of underlying cluster infrastructures of any size while also providing a massively scalable, open-source Hadoop platform for storing, processing, and analyzing large data volumes.

StackIQ Enterprise Data
StackIQ Enterprise Data is a complete, integrated Hadoop solution for enterprise customers. For the first time, enterprises get everything they need to deploy and manage Hadoop clusters throughout the entire operational lifecycle in one product. StackIQ Enterprise Data includes:

Hortonworks Data Platform
The Hortonworks Data Platform, powered by Apache Hadoop, is an open-source, massively scalable, highly stable, and extensible platform based on the most popular and essential Hadoop projects for storing, processing, and analyzing large volumes of structured and unstructured data. It makes it easier than ever to integrate Apache Hadoop into existing data architectures. Highly recommended for anyone who has encountered difficulties installing and integrating Hadoop projects downloaded directly from Apache, it is also ideal for solution providers wanting to integrate or extend their solutions for Apache Hadoop. The platform includes HDFS, MapReduce, Pig, Hive, HBase, and ZooKeeper, along with open-source technologies that make the Hadoop platform more manageable, open, and extensible. These include HCatalog, a metadata management service that simplifies data sharing between Hadoop and other enterprise information systems. Hortonworks has contributed more than 80% of the code in Apache Hadoop to date and is the main driving force behind the next generation of the software; its platform is the only one that is completely consistent with the open-source version.
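As a hedged illustration of how claims data stored in HDFS might be processed on such a platform, the following pair of scripts uses Hadoop Streaming, which lets MapReduce jobs be written in Python. The input layout (policy_id,claim_type,amount) and the file names are assumptions:

#!/usr/bin/env python3
# mapper.py -- Hadoop Streaming mapper: emits (claim_type, amount) pairs
# from input lines like "policy_id,claim_type,amount". The field layout
# is an illustrative assumption.
import sys

for line in sys.stdin:
    parts = line.strip().split(",")
    if len(parts) == 3:
        _, claim_type, amount = parts
        print(f"{claim_type}\t{amount}")

#!/usr/bin/env python3
# reducer.py -- Hadoop Streaming reducer: sums claim amounts per type.
# Hadoop sorts mapper output by key before the reducer sees it.
import sys

current_type, total = None, 0.0
for line in sys.stdin:
    claim_type, amount = line.strip().split("\t")
    if claim_type != current_type:
        if current_type is not None:
            print(f"{current_type}\t{total}")
        current_type, total = claim_type, 0.0
    total += float(amount)
if current_type is not None:
    print(f"{current_type}\t{total}")

These could be launched with the standard Hadoop Streaming jar, for example: hadoop jar hadoop-streaming.jar -input /claims -output /claims-by-type -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py (the HDFS paths here are assumptions about the cluster layout).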
StackIQ Hadoop Manager
StackIQ Hadoop Manager manages the day-to-day operation of the Hadoop software running in the cluster, including configuring, launching, and monitoring HDFS, MapReduce, ZooKeeper, HBase, and Hive. A unified "single pane of glass", with both a command-line interface (CLI) and a graphical user interface (GUI), is used to control and monitor all of these components as well as to manage the infrastructure components in the cluster. Easy to use, the StackIQ Hadoop Manager allows for the deployment of Hadoop clusters of all shapes and sizes. Traditionally, installing and managing a Hadoop cluster has been a long, manual process: the end user or deployment team has had to install and configure each component of the software stack by hand, making setup and ongoing management problematic and time-intensive, with security and reliability implications. StackIQ Enterprise Data completely automates this process.

StackIQ Cluster Manager
StackIQ Cluster Manager manages all of the software that sits between bare metal and a cluster application such as Hadoop. A dynamic database contains all of the configuration parameters for an entire cluster and is used to drive machine configuration, software deployment, management, and monitoring (a simplified illustration of this pattern appears at the end of this answer). Regarding specific features, the Cluster Manager:
• Provisions and manages the operating system from bare metal, capturing networking information (such as MAC addresses)
• Configures host-based network settings throughout the cluster
• Captures hardware resource information (such as CPU and memory details) and uses it to set cluster application parameters
• Captures disk information and uses it to programmatically partition disks across the cluster
• Installs and configures a cluster monitoring system
• Provides a unified interface (CLI and GUI) to control and monitor all of the above

The StackIQ Cluster Manager for Hadoop is based on StackIQ's open-source Linux cluster provisioning and management solution, Rocks. Rocks was initially designed to enable end users to easily, quickly, and cost-effectively build, manage, and scale application clusters for high-performance computing (HPC), and thousands of environments around the world now use it. In StackIQ Enterprise Data, the Cluster Manager's capabilities have been expanded to handle not only the underlying infrastructure but also the day-to-day operation of the Hadoop software running in the cluster. In contrast to static, hand-maintained configuration files, StackIQ Enterprise Data operates from a continually updated, dynamic database populated with site-specific information on both the underlying cluster infrastructure and the running Hadoop services.

Key Benefits of StackIQ Enterprise Data
• The first complete, integrated Hadoop solution for the enterprise
• Faster time to deployment
• Automated, consistent, dependable deployment and management
• Simplified operation that can be learned quickly without systems administration experience
• Reduced downtime due to configuration errors
• Reduced total cost of ownership for Hadoop clusters
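Finally, to illustrate the dynamic-database pattern described in the Cluster Manager section above, here is a simplified, hypothetical sketch in Python. It is not StackIQ's actual implementation; the table layout and the tuning rule are invented for illustration:

# Hypothetical illustration of the "dynamic database" pattern: cluster
# facts captured at provisioning time (MAC, CPU, memory) drive per-host
# application settings. NOT StackIQ's actual implementation; the schema
# and tuning rule are assumptions.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hosts (name TEXT, mac TEXT, cpus INTEGER, mem_gb INTEGER)")
db.executemany("INSERT INTO hosts VALUES (?, ?, ?, ?)", [
    ("data-node-0", "aa:bb:cc:00:00:01", 16, 64),
    ("data-node-1", "aa:bb:cc:00:00:02", 8, 32),
])

# Derive a per-host MapReduce task slot count from the captured hardware facts.
for name, mac, cpus, mem_gb in db.execute("SELECT * FROM hosts"):
    map_slots = min(cpus, mem_gb // 4)  # illustrative tuning rule
    print(f"{name} ({mac}): mapreduce map slots = {map_slots}")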