jcvergar
Joined: 17 Jan 2005 Posts: 328 Thu 13 Jul 2006 Location: Oiartzun - Spain

HIGH IMPACT of failures in long downloading processes

Hi Bruno,
MP executes the searches in three steps:
1.- MP launches the search request. MP also takes note of the different associated data and graphics to be downloaded (e.g. claims, description, mosaics, etc.).
Result => MP gathers the list of patent numbers
2.- MP shows the obtained results (usually year by year) so the user can validate or filter them
Result => MP builds all the URLs needed to download all the data and graphics of the validated patent numbers
3.- MP downloads all the links generated
Result => MP builds the database, indexes, etc.
---
PROBLEM: Sometimes the 3rd step FAILS. Some obvious comments on this:
- failures in short downloading processes (few patents) are very rare
- short downloading processes are easy to prepare again (steps 1-2)
=> so MP runs OK in short downloading processes
- failures in long downloading processes (many patents) are much more frequent
- the time needed to prepare steps 1-2 again in long downloading processes is not trivial
=> so the REAL IMPACT of failures in long downloading processes (in time spent by the user) is much higher than it should be.
SUGGESTION: STORE all the necessary data after step 2 (patent numbers, data and graphics to be downloaded, etc.) in a temporary file, so that downloading processes can be repeated if necessary without redoing steps 1-2.
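
A minimal sketch of this suggestion, written in Python for illustration only (it is not MP's actual code). The file name, the JSON layout, the function names and the patent numbers are all assumptions:

```python
import json
import tempfile
from pathlib import Path


def save_download_plan(patent_numbers, urls_by_patent, plan_path):
    """Store the validated patent numbers and their download URLs on disk."""
    plan_path = Path(plan_path)
    plan = {"patents": list(patent_numbers), "urls": urls_by_patent}
    # Write to a sibling file and rename it, so an interrupted write
    # never leaves a truncated plan behind.
    partial = plan_path.with_name(plan_path.name + ".tmp")
    partial.write_text(json.dumps(plan, indent=2), encoding="utf-8")
    partial.replace(plan_path)
    return plan_path


def load_download_plan(plan_path):
    """Read the plan back so that step 3 can be repeated without steps 1-2."""
    plan = json.loads(Path(plan_path).read_text(encoding="utf-8"))
    return plan["patents"], plan["urls"]


if __name__ == "__main__":
    # Hypothetical patent numbers and placeholder URLs, for illustration only.
    urls = {
        "EP0000001": ["http://example.invalid/EP0000001/claims"],
        "EP0000002": ["http://example.invalid/EP0000002/claims"],
    }
    with tempfile.TemporaryDirectory() as workdir:
        path = save_download_plan(urls.keys(), urls, Path(workdir) / "plan.json")
        patents, restored = load_download_plan(path)
        print(patents, restored == urls)
```

The point of the sketch is only that the output of step 2 is small and easy to serialize, so a failed step 3 can start again from the saved file.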
Regards
_________________
Juan Carlos Vergara

Mannina Site Admin
Joined: 06 Jan 2005 Posts: 978 Thu 13 Jul 2006 Location: Marseille

Hi Juan,
I think I will add this option ASAP, maybe in the next revision, because it is a great idea.
It is not easy to do, but I think it's possible.
Many thanks,
Bruno

jcvergar
Joined: 17 Jan 2005 Posts: 328 Mon 23 Oct 2006 Location: Oiartzun - Spain

Hi Bruno,
I'm sorry to say that Espacenet OPS is not responding as well as expected. I mean that the downloading process gets blocked too often.
In fact, I have spent ALL this morning trying and trying and trying to download 286 patents ... without success. In the end I opted to download year by year. It works, but it is tedious.
Problem to overcome: It would be great to have a mechanism to identify which part of the download list has been downloaded and which part has not.
I remember you told me that it is difficult to know whether the download of an HTML page/PDF has finished OK or not.
Suggestion: I suppose that you can program MP to check whether or not each document in a list is present in predetermined subdirectories.
So (if there is an interruption) MP could resume the downloading process just after the last downloaded document.
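
A rough sketch of that check, in Python for illustration only (not MP's actual code). It assumes one file per patent, named after the patent number, in a single directory; the ".xml" extension and the function name are hypothetical. The size test is only a crude stand-in for "finished OK", since a partially written file would still pass it:

```python
import tempfile
from pathlib import Path


def pending_downloads(patent_numbers, download_dir, extension=".xml"):
    """Return the patent numbers that have no non-empty file in download_dir.

    The order of the input list is preserved.
    """
    download_dir = Path(download_dir)
    pending = []
    for number in patent_numbers:
        target = download_dir / f"{number}{extension}"
        # A missing or zero-byte file is treated as "not downloaded yet".
        if not (target.is_file() and target.stat().st_size > 0):
            pending.append(number)
    return pending


if __name__ == "__main__":
    # Hypothetical patent numbers, for illustration only.
    wanted = ["EP0000001", "EP0000002", "EP0000003"]
    with tempfile.TemporaryDirectory() as workdir:
        (Path(workdir) / "EP0000001.xml").write_text("<patent/>", encoding="utf-8")
        (Path(workdir) / "EP0000002.xml").write_text("", encoding="utf-8")
        print(pending_downloads(wanted, workdir))
```

After an interruption, the same complete list can be passed in again and only the missing documents are returned.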
What do you think?
_________________
Juan Carlos Vergara

Mannina Site Admin
Joined: 06 Jan 2005 Posts: 978 Mon 23 Oct 2006 Location: Marseille

Hi Juan,
OPS and Espacenet have announced that their server will be under maintenance this weekend. Maybe it was under maintenance each time you used MP.
I tried this search:
TI/AB : plastic and bicycle
No date range
No option
==> 500 patents downloaded in 5 minutes and 16 seconds
==> ~1400 links without error.
Meanwhile, I am trying to add a function that saves the patent numbers before launching step 2 (just after Pre-Selecting).
I still have to work two or three more days on Matheo Web, and after that I will work on Matheo Patent 8.0.
Regards,
Bruno

jcvergar
Joined: 17 Jan 2005 Posts: 328 Mon 23 Oct 2006 Location: Oiartzun - Spain

Many thanks, I appreciate your efforts!
_________________
Juan Carlos Vergara

Mannina Site Admin
Joined: 06 Jan 2005 Posts: 978 Mon 23 Oct 2006 Location: Marseille

Juan,
You know that you are one of my favorite users!!!
Best wishes,
Bruno

jcvergar
Joined: 17 Jan 2005 Posts: 328 Thu 02 Nov 2006 Location: Oiartzun - Spain

Hi Bruno, I know that I am quite persistent on this topic, but it is a real problem, and you also did not answer my last proposal:

Quote:
Suggestion: I suppose that you can program MP to check whether or not each document in a list is present in predetermined subdirectories.
So (if there is an interruption) MP could resume the downloading process just after the last downloaded document.
What do you think?

Today the situation is better. I have the LATEST MP version, which has a utility to save the list of patent numbers to be downloaded. This is A VERY GOOD UTILITY to avoid repeating a search query if the download gets blocked. But the problem is not completely solved.
I mean, if the patent list is long (e.g. more than 1000 numbers), my experience says that ... most of the time the downloading process will get blocked before finishing. At that point the user has no information about which patents have been downloaded and which ones are still pending.
Now I see TWO options:
1.- MP "reads" the local directory where patents are stored before beginning the downloading process. So the user asks MP to download the same complete list, but MP works only on the patents not yet downloaded.
2.- MP "takes note" of the whole patent list before beginning the downloading process. I mean, MP loads the patent numbers into the database and shows the patents of the list with a "red square", as documents pending download. If the downloading process gets blocked, only part of the list has been converted to "green square". So the user only has to select the patents that are still "red square" and ask MP to download them again.
I prefer this last option because:
- it is OK to assign a "red square" to the patents because it is clear that all the patents of the list have to be downloaded sooner or later.
- in case of any problem the download status is much more "evident" to the user.
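
To make option 2 concrete, here is a small sketch in Python (illustrative only, not MP's actual code). The `Status` values, the function names and the `fetch` callback are assumptions; a plain dictionary stands in for the MP database:

```python
from enum import Enum


class Status(Enum):
    PENDING = "red square"
    DOWNLOADED = "green square"


def register(patent_numbers):
    """Record every patent as pending before any download starts."""
    return {number: Status.PENDING for number in patent_numbers}


def run_downloads(statuses, fetch):
    """Download pending patents in order and return those still pending.

    `fetch` is a caller-supplied function that raises OSError (for example
    ConnectionError) when the download gets blocked.
    """
    for number, status in statuses.items():
        if status is Status.DOWNLOADED:
            continue  # already green, nothing to repeat
        try:
            fetch(number)
        except OSError:
            break  # blocked: this patent and the following ones stay red
        statuses[number] = Status.DOWNLOADED
    return [n for n, s in statuses.items() if s is Status.PENDING]


if __name__ == "__main__":
    # Hypothetical patent numbers; the fake fetch blocks once on the third one.
    blocked_once = {"EP0000003"}

    def fake_fetch(number):
        if number in blocked_once:
            blocked_once.discard(number)
            raise ConnectionError("download blocked")

    statuses = register(["EP0000001", "EP0000002", "EP0000003", "EP0000004"])
    print("still red after first run:", run_downloads(statuses, fake_fetch))
    print("still red after second run:", run_downloads(statuses, fake_fetch))
```

Because the whole list is registered up front, the pending set after a blocked run is exactly what the user would select and download again.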
Regards
_________________
Juan Carlos Vergara

Mannina Site Admin
Joined: 06 Jan 2005 Posts: 978 Thu 02 Nov 2006 Location: Marseille

Hi Juan, sorry
Well, it's done now. In fact, I have implemented a mix of both of your options.
When you add a patent, I test:

Code:
if (Assigned(PatentNumber) and (Patent.Downloaded)) or
   FileExists(Path+'\'+cPathXML+'\'+PatentNumber+'.xml') then
  // don't process this patent

It will be available in the next revision, or you can get just mpatent8.exe from:
http://www.matheo-software.com/tmp/mpatent8.exe (rev.061101)
Bruno

jcvergar
Joined: 17 Jan 2005 Posts: 328 Thu 02 Nov 2006 Location: Oiartzun - Spain

Hi Bruno, downloaded and running.
I see that MP now:
- doesn't consume CPU resources on tasks already done
- doesn't disturb Espacenet with repeated queries
- doesn't saturate the lines with repeated traffic
So MP is now much more intelligent and much more SUSTAINABLE than before!
AND it takes up less of my time with downloading tasks
... anyway I think that we will continue speaking about downloading ...
Many thanks!!
_________________
Juan Carlos Vergara

Mannina Site Admin
Joined: 06 Jan 2005 Posts: 978 Thu 02 Nov 2006 Location: Marseille

Quote:
... anyway I think that we will continue speaking about downloading ...

... for as long as Matheo Software does not have its own patent database.
Thanks,
Bruno

jcvergar
Joined: 17 Jan 2005 Posts: 328 Fri 03 Nov 2006 Location: Oiartzun - Spain

Hi Bruno,
1.- Could you implement this improvement for ANY kind of download?
I mean, if I select a group of patents and do right-click > download > Mosaic/Drawing, I would like to download only the PENDING Mosaics/Drawings, i.e. those not already downloaded.
The same applies to right-click > download > First Page
The same applies to right-click > download > Description
The same applies to right-click > download > Claims
The same applies to right-click > download > All Cited Documents
The exception is right-click > download > Legal Status, because here it is necessary to FORCE the download and REFRESH the contents of the legal status.
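
As a sketch of the rule requested in point 1 (Python, illustrative only, not MP's actual code): every kind of download skips what is already on disk, except legal status, which is always fetched again. The kind names, the one-subdirectory-per-kind layout and the ".dat" extension are hypothetical:

```python
import tempfile
from pathlib import Path

# Hypothetical names for the right-click download kinds.
KNOWN_KINDS = {
    "mosaic", "first_page", "description",
    "claims", "cited_documents", "legal_status",
}
# Kinds whose content changes over time and must always be refreshed.
ALWAYS_REFRESH = {"legal_status"}


def needs_download(patent_number, kind, root_dir):
    """Decide whether one item of one kind has to be fetched."""
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown download kind: {kind}")
    if kind in ALWAYS_REFRESH:
        return True  # forced download, even if a local copy exists
    return not (Path(root_dir) / kind / f"{patent_number}.dat").is_file()


def select_for_download(patent_numbers, kind, root_dir):
    """Keep only the selected patents that still need this kind of item."""
    return [n for n in patent_numbers if needs_download(n, kind, root_dir)]


if __name__ == "__main__":
    # Hypothetical patent numbers, for illustration only.
    selection = ["EP0000001", "EP0000002"]
    with tempfile.TemporaryDirectory() as root:
        for kind in ("claims", "legal_status"):
            (Path(root) / kind).mkdir()
            (Path(root) / kind / "EP0000001.dat").write_text("x", encoding="utf-8")
        print("claims:", select_for_download(selection, "claims", root))
        print("legal_status:", select_for_download(selection, "legal_status", root))
```

Keeping the forced kinds in one set means the exception is stated once instead of being repeated in each menu action.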
---
2.- Sorry, but "Download all except complete Documents" ... does nothing, at least on my computer. Could you please check?
Many thanks
_________________
Juan Carlos Vergara

Mannina Site Admin
Joined: 06 Jan 2005 Posts: 978 Fri 03 Nov 2006 Location: Marseille

Quote:
2.- Sorry, but "Download all except complete Documents" ... does nothing, at least on my computer. Could you please check?

In fact, I know what it is: you used it on a "Green Square" patent, and it worked only on "Red Square" patents. I have corrected this problem.
You can get the latest revision (mpatent8.exe only) from:
http://www.matheo-software.com/tmp/mpatent8.exe (rev.061103)
Regards,
Bruno