
Merge, Normalize, Remove Duplicates - 3 very large data sets (over 1 million records each)

$30-250 USD

Closed
Posted over 8 years ago

Paid on delivery
I have 3 large data sets in Excel. Each file has over 1 million records. The files are similar but not identical (they have different columns). I'll provide the column mapping for merging the data, and I'll also provide the rules for normalizing the data once merged. Lastly, I'll need duplicates removed. I'll need the data packaged as Excel files, CSV files, and Access files. Data sets will be provided to willing bidders. Thanks!
Project ID: 8389701
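
The merge, normalize, and de-duplicate steps described in the posting could be sketched roughly as follows, assuming the Excel files have first been exported to CSV. The column maps, the normalization rules, and the choice of `email` as the duplicate key are all hypothetical placeholders, since the client states that the real mapping and rules will be supplied later.

```python
import csv
import io

# Hypothetical column maps: source header -> unified header.
# The real mapping is to be provided by the client.
COLUMN_MAPS = {
    "a": {"Email": "email", "Full Name": "name"},
    "b": {"E-mail Address": "email", "Name": "name"},
}


def normalize(record):
    """Assumed rules: trim and collapse whitespace, lowercase the email."""
    out = {key: " ".join(value.split()) for key, value in record.items()}
    out["email"] = out["email"].lower()
    return out


def merge_sources(sources, column_maps, key_fields=("email",)):
    """Yield mapped, normalized rows from every source, skipping repeated keys."""
    seen = set()
    for name, handle in sources.items():
        mapping = column_maps[name]
        for row in csv.DictReader(handle):
            # Rename columns to the unified header; missing cells become "".
            record = normalize(
                {new: (row.get(old) or "") for old, new in mapping.items()}
            )
            key = tuple(record[field] for field in key_fields)
            if key in seen:
                continue  # duplicate of a row already emitted
            seen.add(key)
            yield record


def demo():
    """Run the sketch on two tiny in-memory files with different headers."""
    a = io.StringIO(
        "Email,Full Name\nAnn@Example.com ,Ann  Lee\nbob@example.com,Bob Ray\n"
    )
    b = io.StringIO(
        "Name,E-mail Address\nAnn Lee,ann@example.com\nCy Poe,cy@example.com\n"
    )
    return list(merge_sources({"a": a, "b": b}, COLUMN_MAPS))
```

Rows are streamed one at a time, so only the set of seen keys has to fit in memory. Note that a single Excel worksheet holds at most 1,048,576 rows, so a merged result larger than that would have to be split across sheets or files when packaged as Excel.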

About the project

25 bids
Remote project
Active 9 years ago

25 freelancers are bidding an average of $144 USD for this job
Coursera Data Science & R certified. Successfully completed freelance projects: https://www.freelancer.com/jobs/r/programming-Hadoop/ https://www.freelancer.com/jobs/project-7399544/ (Hadoop Pig and Impala queries)
• Applying analytics using the R programming language: RHadoop (rmr2, rhdfs, rhbase)
• Applied time series analytics (ARIMA) for an oil client to predict oil production from an oil plant. Data mining on Facebook, LinkedIn, and Twitter accounts
• 6 years of IT experience in BI and Big Data Hadoop DWH solutions for the banking and oil & gas domains. Data streaming expertise using Apache Kafka, Apache Hadoop, Apache Spark, and Apache Storm; experience with Big Data, Hadoop, and R, and with tools such as Solr, Hive, HBase, CouchDB, MongoDB, Redis, Neo4j, and Kafka integration on Hadoop and Ubuntu
• Data mining on LinkedIn, Facebook, Twitter
• Expert in ETL tools such as Informatica, SAP BODS, Pentaho, Talend
• Excellent analytical and programming skills (Java / Python / C++) with a good conceptual understanding, plus excellent presentation, interpersonal, and communication skills and a strong desire to achieve specified goals
• Building information views, stored procedures, triggers, materialized views, cursors, partitioning, exception handling, and optimization on databases such as Oracle and SAP HANA
• Expertise in software development applying SDLC practices
$111 USD in 3 days
4.9 (5 reviews)
4.7

Hi, I have more than 3 years of experience developing big data applications. I have developed many projects using open source technologies such as Hadoop, Hive, Pig, and ZooKeeper. Please elaborate on the job so that I can finalize the bid amount and time. I am open to meeting your timeline and budget.
$250 USD in 3 days
5.0 (2 reviews)
2.7

Dear Sir, Greetings from RaajVeer! I understand your job and am ready to start immediately on your terms and within your budget. I have experience working in Excel and normalizing data using functions and macros; I gained this experience working on many email marketing projects and email marketing lists. I would like to invite you to a quick discussion. Your views and feedback on this would be highly appreciated. Awaiting your response. Thanks, RaajVeer S. Tomar, Quick Search The Web Dominators
$147 USD in 3 days
5.0 (1 review)
2.1

Hello, I have experience with processing large files (tens of gigabytes) and managing large databases (over 1 TB). I also have experience with tabular data formats (CSV, XLS, ODS). I usually parse and process files using Python. Can you give me some samples or the whole files so I can get a better idea of what is required? Best wishes, Ionut
$150 USD in 3 days
5.0 (1 review)
1.1

I am a professional database developer with a degree in Computer Science. I have been handling ETL processes for very large databases since the start of my career.
$140 USD in 3 days
5.0 (1 review)
1.1

Hi there! I have done this type of work before! Please feel free to ping me with additional information or questions, if any! Thanks! -Steve
$125 USD in 2 days
0.0 (0 reviews)
1.1

I can do that. You will tell me which columns are identical across the files and how you want to normalize them. I can extract the duplicate entries and give you the file of unique entries in any format you want.
$200 USD in 3 days
0.0 (0 reviews)
0.0

I have worked with very big files (CSV, Excel) and have normalized this type of file before. I can send the file wherever you want.
$155 USD in 3 days
0.0 (0 reviews)
0.0

Hi, I have more than 14 years of VBA/Excel experience and I am an expert in this kind of work. I completed something similar a month ago. I have completed more than 250 projects. Please look at the feedback left by my employers to learn more about my work. Waiting for your positive response. Thanks.
$100 USD in 3 days
0.0 (0 reviews)
0.0

I am currently employed as a Sr. Quality Analyst for a large mining company in North America. I administer my company's production and cost databases, and I also create and administer my own databases for the company. I am well versed in manipulating very large data sets from different databases and spreadsheets.
$111 USD in 4 days
0.0 (0 reviews)
0.0

I have expertise in handling large amounts of data in Excel, using scripts to merge and remove duplicates based on the conditions that you provide. I have handled such data in the past. Ready to start work immediately.
$155 USD in 3 days
0.0 (0 reviews)
0.0

I have relevant experience, including data migration, using ETL tools for data integration and manipulation, and dumping data to file formats such as CSV, TXT, and Excel. Happy to discuss.
$222 USD in 10 days
0.0 (0 reviews)
0.0

Hi, I have almost 8 years of Oracle PL/SQL development experience. I worked as a developer at Nucleus S/W Exports Ltd in Noida, India, on projects in the banking domain (for SCB, ABN AMRO, Bank of Bahrain), where I dealt with exporting and importing huge files using different methods: SQL*Loader, bulk insert, and external tables. Using my previous experience, I can certainly work on your project requirement. Please share further details about the project. Rgds, Purnima Chopra
$111 USD in 5 days
0.0 (0 reviews)
0.0

I work with a database that receives a billion new records every day, so I do not expect any problems with your data sets. My plan is to load the records into an Oracle database, do the normalization and merge there, and then export the results in CSV format, using Microsoft tools to produce the other desired formats. I expect the normalization and merge rules to be of medium complexity at most and clearly specified on your side. If the data set includes special characters, the price will increase by 20e, and if you spare me the Access format, it will decrease by 10e. Do not hesitate to contact me for further details.
$111 USD in 4 days
0.0 (0 reviews)
0.0

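
The load, normalize, merge, and export-to-CSV plan in the bid above can be illustrated with a small sketch. The bidder names Oracle; this example swaps in SQLite from the Python standard library purely so that it is self-contained. The `staging` table, its two columns, and the normalization rules are assumptions for illustration, not the client's actual schema.

```python
import csv
import io
import sqlite3


def merge_in_database(batches):
    """Load rows into a staging table, normalize and de-duplicate in SQL,
    and return the result as CSV text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (email TEXT, name TEXT)")
    for rows in batches:
        # One batch per source file; each row is an (email, name) tuple.
        conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    # Assumed normalization rules: trim whitespace, lowercase the email.
    conn.execute(
        "UPDATE staging SET email = lower(trim(email)), name = trim(name)"
    )
    # Keep one row per email; MIN(name) is an arbitrary tie-breaker.
    cursor = conn.execute(
        "SELECT email, MIN(name) FROM staging GROUP BY email ORDER BY email"
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["email", "name"])
    writer.writerows(cursor)
    conn.close()
    return out.getvalue()
```

Doing the de-duplication in the database means the engine, not the script, handles sorting and grouping of data that may not fit in memory. Which row survives among duplicates is a business rule the client would need to specify.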
Let's start. We have a team of professionals and provide high-quality, accurate work. Would you like to discuss your current requirements in more detail?
$100 USD in 3 days
0.0 (0 reviews)
0.0

Hi. I have 2 years of Big Data experience and have worked with different frameworks and technologies in this field. I have also had a lot of data transformation tasks while working on projects. I will do this job for a minimal cost in order to build up my profile rating, because I don't have any reviews for such projects on this site. Also, please clarify the point "I'll need the data packaged in Excel files, CSV files, Access files." Do I need to save the result in all of these formats or just in one of them?
$30 USD in 1 day
0.0 (0 reviews)
0.0

I am a graduate systems analyst with over 4 years of experience in the field. I focus on quality and on customers, always looking for the best solutions.
$166 USD in 5 days
0.0 (0 reviews)
0.0

I work with large sheets like this a lot in my regular job. I was just merging a 2 million row sheet and a 1 million row sheet the other day. Removing dups is easy. Based on what I know now, this should be pretty straightforward.
$111 USD in 2 days
0.0 (0 reviews)
0.0

Hello, I have access to a few very powerful servers. I think the best option would be to unzip the Excel files and work at the file level: create instances that each work on part of a file, create normalized sets and merge them, then sort by each column and remove duplicates. But I would have to see the data sets to know what I'm working with. I'm not scared of the number of records, but of the number of columns...
$147 USD in 3 days
0.0 (0 reviews)
0.0

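
The idea in the bid above of unzipping the Excel file and working at the file level relies on the fact that an `.xlsx` file is a ZIP archive of XML parts. As a minimal sketch of that approach, the following streams each worksheet part and counts its rows without loading a whole sheet into memory. The `make_sample` helper is hypothetical and builds only a bare worksheet part for demonstration, not a complete workbook that Excel would open.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by SpreadsheetML worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def count_rows(xlsx_file):
    """Return {worksheet part name: row count} by streaming the sheet XML."""
    counts = {}
    with zipfile.ZipFile(xlsx_file) as archive:
        for name in archive.namelist():
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            total = 0
            with archive.open(name) as part:
                # iterparse emits each element when its end tag is reached.
                for _event, elem in ET.iterparse(part):
                    if elem.tag == NS + "row":
                        total += 1
                        elem.clear()  # drop cell data that was already counted
            counts[name] = total
    return counts


def make_sample():
    """Hypothetical helper: an in-memory archive with one two-row sheet part."""
    sheet = (
        '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
        "<sheetData>"
        '<row r="1"><c r="A1"><v>1</v></c></row>'
        '<row r="2"><c r="A2"><v>2</v></c></row>'
        "</sheetData></worksheet>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/worksheets/sheet1.xml", sheet)
    buffer.seek(0)
    return buffer
```

A real pipeline would also need to resolve text cells through the workbook's shared strings part, which this sketch leaves out.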
I am a software engineer with more than 10 years of experience in the industry. I am new to this site and need my first projects, so this could be a great opportunity for both of us. I am very proficient with databases and with Excel, but Big Data is my passion. For this kind of task I usually use the R programming language, which makes data manipulation easy and offers a variety of output formats.
$120 USD in 3 days
0.0 (0 reviews)
0.0

About the client

Location: Sarasota, United States
5.0
280
Payment method verified
Member since Mar 22, 2007

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)