
Improve webpage scraping solution

$30-250 USD

Closed
Posted over 3 years ago

Paid on delivery
I have a Java program that scrapes information from a website. The architecture of the solution involves:
1) using Java Selenium to send requests to the webpage via Chrome WebDriver to trigger authentication and authenticated requests;
2) routing the requests from Chrome (headless) through Java BrowserMobProxy to capture three HTTP headers (Authorization, X-CSRF-TOKEN, and Cookie) and one query string; and
3) using these 4 elements in HTTPS requests sent from Java directly to the webpage (i.e. without Selenium, Chrome, or BrowserMobProxy involved) to retrieve the desired information.

This program performs the basic function of extracting the information but has a few problems:
- It depends on an external non-Java component: Chrome WebDriver.
- It depends on Java Selenium and Java BrowserMobProxy, two dependencies that I would like to remove.
- It is not optimized (too many refreshes and overly long sleep periods) relative to the limit at which the webpage (behind Cloudflare) starts responding with 429 errors. As a result, retrieving the information takes much longer than needed.

Deliverables
You will get the current program's Java code and you will need to solve the problems above. To do so, you will need to:
A. Find out how to authenticate and refresh the 3 headers and the query string without depending on Selenium, Chrome WebDriver, and BrowserMobProxy. As most of this data is likely generated in JavaScript, you will need knowledge of JavaScript and of how to execute JavaScript from within Java or convert the JavaScript code to Java (the preferred solution).
B. Identify the limit at which the webpage (behind Cloudflare) starts responding with 429 errors, and tune the refresh frequency of the headers and the sleep periods to that limit.
You will need to demonstrate the benefits of your changes by extracting the information currently extracted by the program and measuring how long it takes.
Note: you will need to create your own login/password on the webpage. No additional requirements exist to register.
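
For illustration only, step 3 of the architecture and the throttling in deliverable B could be sketched as below. This is not the client's code: it assumes Java 11+ and the JDK's `java.net.http` API, and the class names (`SessionTokens`, `DirectScraper`), the delay bounds, and the "double on 429, shrink by 10% otherwise" policy are all hypothetical choices. The real limit would still have to be measured against the site, and `Retry-After` is only used if the server happens to send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical holder for the four captured values; all names are illustrative. */
final class SessionTokens {
    final String authorization;
    final String csrfToken;
    final String cookie;
    final String queryString;

    SessionTokens(String authorization, String csrfToken, String cookie, String queryString) {
        this.authorization = authorization;
        this.csrfToken = csrfToken;
        this.cookie = cookie;
        this.queryString = queryString;
    }
}

final class DirectScraper {
    // Assumed bounds for the pause between requests; the real limit must be measured.
    static final long MIN_DELAY_MS = 100;
    static final long MAX_DELAY_MS = 60_000;

    /** Builds a GET request that carries the three headers and the query string. */
    static HttpRequest buildRequest(String baseUrl, SessionTokens tokens) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "?" + tokens.queryString))
                .header("Authorization", tokens.authorization)
                .header("X-CSRF-TOKEN", tokens.csrfToken)
                .header("Cookie", tokens.cookie)
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    /**
     * Picks the next pause: on a 429, double it (or honor Retry-After if longer);
     * on any other status, shrink it by 10% down to the floor.
     * retryAfterSeconds is 0 when the server sent no Retry-After header.
     */
    static long nextDelayMillis(long currentMs, int status, long retryAfterSeconds) {
        if (status == 429) {
            long doubled = Math.min(currentMs * 2, MAX_DELAY_MS);
            return Math.max(doubled, retryAfterSeconds * 1000);
        }
        return Math.max(MIN_DELAY_MS, currentMs - currentMs / 10);
    }
}
```

A caller would send the request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, pass `response.statusCode()` to `nextDelayMillis`, and sleep for the returned time before the next request. Logging the delay at which 429 responses stop would give a first estimate of the limit asked for in deliverable B.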
Project ID: 26781086

About the project

11 bids
Remote project
Active 4 years ago

11 freelancers are bidding an average of $171 USD for this job
Hi, Greetings! ✅ Checked your project details: Improve webpage scraping solution ✅ Completion time: within the project deadline. We have worked on 600+ projects. I have 6+ years of experience in the same kind of projects. If you are looking for a true freelancer, I am the right person for you. I am available almost 24/7 and am very responsive. I feel proud that I am a trusted freelancer who pleases almost every single client. You can rest assured that your work will be delivered well in advance of others, with passion and accuracy. I guarantee you instant communication and responses when you need me. Why choose me? I think every client is the reason for my success. I only take projects which I am sure I can do quickly. My portfolio items: https://www.freelancer.com/u/schoudhary1553 I would really like to work with you on this project. If interested, kindly contact me via chat for further details and discussion. Thank you. Sandeep Digital screencast
$220 USD in 4 days
5.0 (144 reviews)
7.4
NOTE: I HAVE EXPERTISE IN JAVA AND JAVASCRIPT. I CAN COMPLETE THIS TASK QUICKLY. With respect to this project, I would like to present myself as a candidate for your consideration. I have more than 12 years of IT experience. I have successfully completed projects that involved programming and scripting. Key features of my work are as follows: 1) Always within schedule 2) Always within scope 3) Best tools and techniques used for the completion of tasks. Looking forward to hearing from you. Regards
$140 USD in 4 days
4.9 (42 reviews)
5.7
Hello. How are you today? I am very interested in your project, so I have prepared a bid. I know you need significant help, and I sincerely want to help you. Of course, I know that helping you requires experience with the relevant concepts and practices. As a long-time developer, I have been practicing my skills and keeping up with the technology, so I'm sure I can help you well. When you contact me, we can discuss the details. I will wait for your kind reply.
$100 USD in 1 day
5.0 (17 reviews)
5.2
Dear Employer, I have read the project details and am confident I can work on improving the web page scraping. I have extensive knowledge of Java, JavaScript, Python, software architecture, etc. Kindly message me so that we can discuss the work further. Regards, Lucky
$222 USD in 3 days
5.0 (36 reviews)
5.2
Greetings Sir, I am Muhammad Faisal, a professional Java developer with almost 4 years of experience. We will provide you with quality work within your time and budget, so let's get started. Thanks
$140 USD in 7 days
4.9 (23 reviews)
3.7
Hi, I am an expert developer who has done extensive web scraping work using only Java. I can remove all dependencies on Chrome and remove any sort of sleeps. The way I'd do it is to wait only until the DOM is rendered or the specific elements are ready. I can get this done while keeping the code clean and easy to read.
$250 USD in 7 days
4.8 (2 reviews)
3.3
Hello, sir. How are you? I am very interested in your project. I have rich experience in scraping websites and processing data with the Selenium package. I also have experience with scrapers that use a Chrome extension and a proxy server, which is needed when the scraper logs in to a website and collects the data. Once all the data is retrieved, processing it the way you want is not a problem for me. I have strong skills in automated scraping and data processing. I can create the scraper with Python, PHP, or C#, and I will provide a professional solution. If you hire me, I will work for you until you are satisfied and the project is complete. I will be waiting for your reply. Thank you. Best regards.
$140 USD in 7 days
5.0 (4 reviews)
2.5
I have optimized a lot of distributed crawling projects and Selenium-based crawlers. Please connect with me over chat and share your code. I will first identify all the pain points and the alternatives for your authentication and authorization libraries. Believe me, I will make it much simpler. Regarding delays and timeouts, I need to look at the code, so please share your source code.
$100 USD in 1 day
4.4 (1 review)
1.7

About the client

Băilești, Romania
5.0
1
Member since Mar 8, 2020
