Il Post #3240
		return 'multiple';
	}
	else {
		return 'newspaperArticle';
Check that it's actually an article - we don't want to capture the home page as a `newspaperArticle`, for instance.
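One way to act on this is to gate the `newspaperArticle` result on a marker that only article pages carry. The sketch below is hypothetical: the `og:type` meta check and the `isArticlePage` and `detectArticle` helpers are assumptions for illustration, not taken from the PR, and the real Il Post markup may call for a different selector.

```javascript
// Hypothetical helper: true only when the page declares itself an article.
// The og:type selector is an assumption; adjust it to the site's real markup.
function isArticlePage(doc) {
	return !!doc.querySelector('meta[property="og:type"][content="article"]');
}

// Simplified detection that returns false for non-article pages
// such as the home page, instead of defaulting to 'newspaperArticle'.
function detectArticle(doc) {
	return isArticlePage(doc) ? 'newspaperArticle' : false;
}
```

With a check like this in the final `else` branch, pages that match none of the URL patterns are only captured when they actually look like articles.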
		// TODO
		// return 'blogPost';
	}
	else if (url.includes('/cerca/')) {
Suggested change:
-	else if (url.includes('/cerca/')) {
+	else if (url.includes('/cerca/') && getSearchResults(doc, true)) {
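For context, the second argument to `getSearchResults()` is the usual "check only" flag: when it is true, the function returns as soon as it finds one usable result, so `detectWeb()` can confirm the search page is non-empty without building the full list. A minimal sketch of that pattern, assuming a hypothetical selector (`article h2 a`) that is not taken from the PR:

```javascript
// Sketch of the checkOnly pattern. The selector is a placeholder assumption;
// the real translator would target Il Post's actual search result markup.
function getSearchResults(doc, checkOnly) {
	let items = {};
	let found = false;
	let rows = doc.querySelectorAll('article h2 a');
	for (let row of rows) {
		let href = row.href;
		// Collapse whitespace in the link text to get a clean title
		let title = (row.textContent || '').replace(/\s+/g, ' ').trim();
		if (!href || !title) continue;
		// In check-only mode, one valid result is enough
		if (checkOnly) return true;
		found = true;
		items[href] = title;
	}
	return found ? items : false;
}
```

Because the function returns `false` when nothing matches, an empty search page makes the suggested condition fail, so `detectWeb()` does not report `multiple` for it.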
async function doWeb(doc, url) {
	await doWebInternal(doc, url, true);
}

async function doWebInternal(doc, url, includeSearch) {
	switch (detectWeb(doc, url)) {
		case 'newspaperArticle':
			await scrapeArticle(doc, url);
			break;
		case 'multiple':
			if (!includeSearch) return;
			let searchResults = getSearchResults(doc, false);
			if (!searchResults) return;
			let items = await Zotero.selectItems(getSearchResults(doc, false));
			if (!items) return;
			for (let url of Object.keys(items)) {
				await doWebInternal(await requestDocument(url), url, false);
			}
			break;
	}
}
Can just move the contents of `doWebInternal()` to `doWeb()` and call `scrapeArticle()` directly in the `multiple` handler.
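A rough sketch of what that flattening could look like. Everything in the first section (`scraped`, and the stand-in versions of `Zotero.selectItems`, `requestDocument`, `detectWeb`, `getSearchResults`, and `scrapeArticle`) is hypothetical scaffolding so the example runs outside Zotero; only the shape of `doWeb()` is the point.

```javascript
// --- Stand-ins so the sketch runs outside Zotero (all hypothetical) ---
const scraped = [];
const Zotero = {
	// The real Zotero.selectItems shows a picker; this stand-in selects everything.
	selectItems: async items => items
};
async function requestDocument(url) {
	return { kind: 'article', url };
}
function detectWeb(doc, url) {
	if (url.includes('/cerca/') && getSearchResults(doc, true)) return 'multiple';
	return doc.kind === 'article' ? 'newspaperArticle' : false;
}
function getSearchResults(doc, checkOnly) {
	let results = doc.results || {};
	let found = Object.keys(results).length > 0;
	if (checkOnly) return found;
	return found ? results : false;
}
async function scrapeArticle(doc, url) {
	scraped.push(url);
}

// --- The flattened doWeb(): no doWebInternal(), no includeSearch flag ---
async function doWeb(doc, url) {
	switch (detectWeb(doc, url)) {
		case 'newspaperArticle':
			await scrapeArticle(doc, url);
			break;
		case 'multiple': {
			let items = await Zotero.selectItems(getSearchResults(doc, false));
			if (!items) return;
			for (let itemURL of Object.keys(items)) {
				// Search results link to articles, so scrape them directly
				// instead of running detection a second time.
				await scrapeArticle(await requestDocument(itemURL), itemURL);
			}
			break;
		}
	}
}
```

Calling `scrapeArticle()` directly removes the recursion and the `includeSearch` guard, and the braces around the `multiple` case keep its `let` declarations scoped to that case.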
	}
}

const ISSN = '2610-9980';
Can inline this
	let sections = [];
	for (let taxonomy of taxonomyData) {
		if (!taxonomy.name) continue;
		switch (taxonomy?.taxonomy) {
Suggested change:
-		switch (taxonomy?.taxonomy) {
+		switch (taxonomy.taxonomy) {
Not much good in using a short-circuiting operator when we've already accessed `taxonomy.name` above.
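To illustrate the point: `?.` only guards against the object itself being `null` or `undefined`, and the earlier `taxonomy.name` access would already have thrown in that case. The two hypothetical helpers below (names and sample values invented for this sketch) behave identically for every input.

```javascript
// Both helpers read .name first, as the loop body in the diff does.
function readWithOptionalChaining(taxonomy) {
	if (!taxonomy.name) return undefined; // throws TypeError if taxonomy is null
	return taxonomy?.taxonomy; // the ?. can never short-circuit here
}

function readPlain(taxonomy) {
	if (!taxonomy.name) return undefined; // same TypeError for null
	return taxonomy.taxonomy;
}
```

For an object such as `{ name: 'Italia', taxonomy: 'category' }` both return the same value, and for `null` both throw at the `.name` access before the `?.` is ever reached.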
				break;
		}
	}
	if (sections.length > 0) item.section = sections.sort().join(', ');
If they're tags, we should be putting them in `item.tags`. I don't think there's often a good reason to put multiple things into `item.section`.
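A possible shape for that change, as a sketch only. The taxonomy keys `'post_tag'` and `'category'` and the `applyTaxonomies` helper are assumptions, since the `case` labels are not visible in this part of the diff; `item` is a plain object standing in for a Zotero item.

```javascript
// Hypothetical helper: route tag-like taxonomies to item.tags and keep
// at most one value in item.section.
function applyTaxonomies(item, taxonomyData) {
	for (let taxonomy of taxonomyData) {
		if (!taxonomy.name) continue;
		switch (taxonomy.taxonomy) {
			case 'post_tag': // assumed key for tags
				item.tags.push({ tag: taxonomy.name });
				break;
			case 'category': // assumed key for the section
				// Keep only the first section instead of joining several
				if (!item.section) item.section = taxonomy.name;
				break;
		}
	}
	return item;
}
```

This keeps `item.section` to a single value and lets Zotero treat the remaining terms as ordinary tags.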
Translator for the Italian newspaper Il Post. Supports articles and search pages. Recognizes podcasts and newsletters but does not support them yet.