I have a simple UI in Tkinter that fixes several issues WITHOUT changing the core library. If you are interested, it shows some interesting things you can do with icrawler. Yes, it might seem like a mess, but if you are already using icrawler it should be clear. I can write Python and I am learning Tkinter, so suggestions are welcome on my Issues list. Most things work and I want to add more.

I forked the whole project in case I needed to do fixes, but the UI is all in `/examples/`:
https://github.com/Patty-OFurniture/icrawler

#98 - `keep_file()` override in FilenameDownloader checks the file type; you can return False if extension != "jpg"
#111 - example of how to override `set_logger()` for full control (commented out for me)
#108 - get the file name (from Content-Disposition or URL)
#108 - also log (INFO) image #, filename, URL. You can change the formatting, log to a file, or whatever else you want
#117 and #107 - log (DEBUG) the Google content if no images are found, to help resolve it if it's still a problem
#110 - a similar log could be done for Bing. Not implemented, but easily copied (google.py)
#106 - a keyword separator option, so you can enter, for example, "beans|rice" and search first for "beans" and then for "rice", separately
#103 - the Google language selection fix should help Baidu, since it adds headers to look more like a web browser and avoid getting flagged
#104 - Google language selection should help. Common languages are in GoogleLanguageOptions.py; add to it if you need to
#61 - sort of fixed, it creates a directory for each keyword. "rice" goes in storage/rice/, "beans" in storage/beans/ - hopefully it is a good example.
#121 - a better, but not perfect, check for disk space errors, in the core library
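
For anyone curious how the keyword separator (#106) and the per-keyword directories (#61) fit together, here is a rough standalone sketch. It is not the code from the fork: `split_keywords`, `storage_dir_for` and `plan_searches` are made-up names for illustration, and it only uses the standard library.

```python
from pathlib import Path


def split_keywords(raw, separator="|"):
    """Split "beans|rice" into ["beans", "rice"], dropping empty entries."""
    return [part.strip() for part in raw.split(separator) if part.strip()]


def storage_dir_for(root, keyword):
    """Return a per-keyword directory such as storage/rice/ (not created here)."""
    # Replace characters that are unsafe in directory names on common systems.
    safe = "".join(c if c.isalnum() or c in " -_" else "_" for c in keyword).strip()
    return Path(root) / safe


def plan_searches(raw, root="storage"):
    """Pair each keyword with its own directory, in the order entered."""
    return [(kw, storage_dir_for(root, kw)) for kw in split_keywords(raw)]


if __name__ == "__main__":
    for keyword, directory in plan_searches("beans|rice"):
        print(keyword, "->", directory)
```

Each (keyword, directory) pair could then be handed to a separate crawl, for example a crawler created with `storage={"root_dir": str(directory)}` and run with `crawl(keyword=keyword, ...)`, so that "rice" and "beans" never share a folder.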
Also image type detection for #108, finding the correct file extension
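
A minimal sketch of the #108 idea, in case it helps: take the file name from Content-Disposition when the server sends one, fall back to the URL otherwise, then correct the extension by looking at the first bytes of the download. This is an illustration under my own assumptions, not the fork's implementation; the function names and the short signature table are hypothetical, and only a handful of formats are covered.

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlparse

# Leading bytes ("magic numbers") of a few common image formats.
_SIGNATURES = [
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
]


def sniff_extension(data):
    """Guess an image extension from the first bytes, or None if unknown."""
    for magic, ext in _SIGNATURES:
        if data.startswith(magic):
            return ext
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def name_from_response(url, content_disposition=None):
    """Prefer the Content-Disposition filename, else the last URL path segment."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Drop any directory part a server might have included.
            return posixpath.basename(name)
    return posixpath.basename(unquote(urlparse(url).path))


def corrected_filename(url, data, content_disposition=None, default="image"):
    """Build a file name whose extension matches the detected image type."""
    name = name_from_response(url, content_disposition) or default
    stem, _ = posixpath.splitext(name)
    ext = sniff_extension(data)
    # Keep the original name when the type cannot be detected.
    return f"{stem or default}.{ext}" if ext else name
```

The same `sniff_extension()` result could also drive a #98-style filter, e.g. only keeping a file when the detected type is "jpg", instead of trusting the extension in the URL.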
Thanks to hellock for the library, I'm just making it easier for me to use!
Have fun!
Patty