Add query parameter [Based on #357] #403
@addie9800 Thanks for adding this. Looks like a great feature :)
@@ -18,7 +18,7 @@

 def get_test_article(enum: PublisherEnum, url: Optional[str] = None) -> Optional[Article]:
     if url is not None:
-        source = WebSource([url], publisher=enum.publisher_name)
+        source = WebSource([url], publisher=enum.publisher_name, query_parameters=enum.query_parameter)
Since this is about crawling a specified URL, we should not pass the query parameter here.
Makes sense, I'll remove that
src/fundus/scraping/html.py
Outdated
         delay: Optional[Delay] = None,
     ):
         self.url_source = url_source
         self.publisher = publisher
         self.url_filter = url_filter
         self.request_header = request_header or _default_header
+        self.query_parameters = query_parameters
I would suggest
    self.query_parameters = query_parameters or {}
and removing the check
    if self.query_parameters is not None:
Sounds good
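For illustration, the suggested `or {}` default can be sketched roughly as follows. `SourceSketch` and `describe` are hypothetical names made up for this example and are not the class under review; the point is only that normalizing `None` to an empty dict in the constructor makes a later `is not None` check unnecessary.

```python
from typing import Dict, Optional


class SourceSketch:
    """Hypothetical stand-in for the reviewed class, used for illustration only."""

    def __init__(self, query_parameters: Optional[Dict[str, str]] = None):
        # Normalize None to an empty dict once, at construction time.
        self.query_parameters = query_parameters or {}

    def describe(self) -> str:
        # Iterating an empty dict yields nothing, so no None guard is needed here.
        return "&".join(f"{key}={value}" for key, value in self.query_parameters.items())
```

One caveat of this idiom: an explicitly passed empty dict is replaced by a new empty dict, which is harmless here but matters if the caller expects to keep a reference to the same object.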
This PR is a reworked version of #392 and closes it; the code has been changed to work with #357.
The major change compared to the former version is that the query_parameter attribute accepts a dictionary of key-value pairs. Adding Heise will be done in a separate PR.
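As a rough sketch of what a dictionary of key-value pairs could look like when applied to a URL, the helper below merges such a dictionary into a URL's query string using only the standard library. The function name `with_query_parameters` and the example parameter `page=all` are assumptions made for illustration; how this PR actually attaches the parameters to requests is not shown in this conversation.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_parameters(url: str, query_parameters: Dict[str, str]) -> str:
    """Return url with query_parameters merged into its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep parameters already present in the URL, then let the new ones override them.
    merged = dict(parse_qsl(parts.query, keep_blank_values=True))
    merged.update(query_parameters)
    return urlunsplit(parts._replace(query=urlencode(merged)))


# Usage with a made-up parameter; an empty dict leaves the URL unchanged.
print(with_query_parameters("https://example.com/article", {"page": "all"}))
```

Using a dictionary rather than a single string keeps each key and value separate, so encoding and merging with existing parameters can be left to `urlencode`.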