Posts

BIG DATA ASSIGNMENT 7

[Image]
The People I Contact Most

This post visualizes the people I contacted most often over the last two weeks. I used social network analysis (SNA) to build the visualization. Before showing the results, I will briefly explain SNA.

What is SNA (Social Network Analysis)?

Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them.

Tool: Gephi

The Gephi screenshots show, in order: the 9 nodes of the contact network, the edges, the overview, the modularity classes, authority, betweenness centrality, closeness centrality, degree centrality, and eccentricity.
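To make the node-level metrics listed above more concrete, here is a minimal Python sketch of degree centrality, closeness centrality, and eccentricity. It is not the Gephi workflow from the post: the `contacts` graph, its names, and its ties are invented for illustration, the graph is assumed to be undirected and connected, and the formulas are the common textbook definitions, so Gephi's own normalization may differ.

```python
from collections import deque

# Hypothetical undirected contact graph (names and ties are invented).
contacts = {
    "me": ["ana", "budi", "citra", "dewi"],
    "ana": ["me", "budi"],
    "budi": ["me", "ana"],
    "citra": ["me"],
    "dewi": ["me", "eko"],
    "eko": ["dewi"],
}


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly tied to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    result = {}
    for node in graph:
        dist = shortest_path_lengths(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result


def eccentricity(graph):
    """Distance from each node to the node farthest away from it."""
    return {
        node: max(shortest_path_lengths(graph, node).values())
        for node in graph
    }


if __name__ == "__main__":
    degree = degree_centrality(contacts)
    closeness = closeness_centrality(contacts)
    ecc = eccentricity(contacts)
    for node in contacts:
        print(node, round(degree[node], 3), round(closeness[node], 3), ecc[node])
```

In this toy graph the hub node has the highest degree and closeness and the lowest eccentricity, which is the pattern one would expect for the person at the center of a contact network.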
[Image]
CLASSIFICATION METHOD (ASSIGNMENT 6)

Data mining

Several major data mining techniques have been developed and used in recent data mining projects, including association, classification, clustering, prediction, sequential patterns, and decision trees. We will briefly examine these techniques in the following sections.

Classification

Classification is a classic data mining technique based on machine learning. Basically, classification is used to assign each item in a data set to one of a predefined set of classes or groups. Classification methods make use of mathematical techniques such as decision trees, linear programming, neural networks, and statistics.

Clustering

Clustering is a data mining technique that automatically forms meaningful or useful clusters of objects that have similar characteristics. The clustering technique defines the classes and put...
[Image]
EREADER SCORING AND EREADER TRAINING ANALYSIS USING RAPIDMINER (ASSIGNMENT 5)

I want to build a decision tree for the eReader scoring and training data using RapidMiner.

1. Add the data file ereader scoring.csv to RapidMiner.
2. Add the data file ereader training.csv to RapidMiner.
3. After the data above has been added, RapidMiner shows this.
4. Click Design, next to Results.
5. Drag in Scoring and Training.
6. Add a Set Role operator to each of the training and scoring streams. In the Parameters area on the right-hand side of the screen, set the role of the User_ID attribute to id. Then add another Set Role operator to the training stream and set the role of the eReader_Adoption attribute to label.
7. Next, search the Operators tab for Decision Tree. Select the basic Decision Tree operator and add it to your training stream.
8. Then drag the Apply Model O...
[Image]
PREDICTION MODEL USING ORANGE (Assignment 4)

I built Decision Tree, Naive Bayes, and KNN models on the Pemilu (Indonesian general election) data using the Orange application.

1. Decision Tree
2. Naive Bayes
3. KNN

Conclusion: I compared the three methods (KNN, Naive Bayes, and Decision Tree) systematically, starting with the decision tree and then Naive Bayes, and measured them by AUC (area under the receiver operating characteristic curve). In the end, the KNN algorithm performed very well on the Pemilu data. The decision tree was too difficult to read, and Naive Bayes was not the best choice either, because it gave an incomplete result.
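Since the comparison above relies on AUC, a small Python sketch may help show what that number measures. It is not what Orange runs internally and it does not use the Pemilu data: the labels and the scores for `model_a` and `model_b` are invented. The function uses the pairwise interpretation of AUC, which is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical true classes and predicted scores from two made-up models.
labels = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1]  # ranks every positive above every negative
model_b = [0.9, 0.7, 0.4, 0.6, 0.5, 0.1]  # ranks some negatives above positives

if __name__ == "__main__":
    print("model_a AUC:", roc_auc(labels, model_a))
    print("model_b AUC:", roc_auc(labels, model_b))
```

An AUC of 1.0 means the model ranks every positive above every negative, and 0.5 is what random scoring would give, so a higher AUC for one classifier indicates a better ranking of the examples.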

Data Visualization

[Image]
Data visualization (Assignment 3)

Data visualization, or data visualisation, is viewed by many disciplines as a modern equivalent of visual communication. It involves the creation and study of the visual representation of data, meaning "information that has been abstracted in some schematic form, including attributes or variables for the units of information". A primary goal of data visualization is to communicate information clearly and efficiently via statistical graphics, plots, and information graphics. Numerical data may be encoded using dots, lines, or bars to visually communicate a quantitative message. Effective visualization helps users analyze and reason about data and evidence. It makes complex data more accessible, understandable, and usable. Users may have particular analytical tasks, such as making comparisons or understanding causality, and the design principle of the graphic (i.e., showing comparisons or showing causality) follows the task. Tables are gene...

BIG DATA

[Image]
Survey on clinical prediction models for diabetes prediction (Assignment 1 & 2)

Introduction

Predictive analytics uses statistical or machine learning methods to make predictions about future or unknown outcomes [1]. It uses text mining for unstructured data and answers the question “What is the next step?” It uses historical and present data to predict future activity, behaviour, and trends. To do this, it makes use of statistical analysis techniques, analytical queries, and automated machine learning algorithms. Predictive analytics needs experts to build predictive models, which are then used for prediction. There are many applications of predictive analytics, one of which is health care. One of the most common diseases nowadays is diabetes. Many people suffer from it, and the number of patients increases day by day. The World Health Organization (WHO) predicts that by 2030 there will be approximately 350 million people worldwide affected by diabetes [2, 3]. Mo...