Add IPinfo IP-to-country crawler #151

Open: maxmouchet wants to merge 1 commit into main

maxmouchet commented Nov 5, 2024

Description

This adds a crawler for IPinfo's free IP-to-country database.

Motivation and Context

Brief discussion in #150.

There are a few challenges related to this data source:

  • There's a relatively large number of prefixes: ~1.5M IPv4 and ~3.4M IPv6.
    • Furthermore, these prefixes are not necessarily stable over time: they can change from one day to the next as the geolocation is updated.
    • Is this an issue for IYP/Neo4j?
  • The data is licensed under a CC BY-SA license, but a token is required to download the file.
    • This token can be obtained by creating a free ipinfo.io account. I thought it might be best for someone in the core IYP team to own this account?
    • The token is passed through ipinfo.token in config.json.
  • Meaning of a prefix?
    • Should there be different prefix types?
    • IPinfo's prefixes represent consecutive IP addresses with the same geolocation, but they do not necessarily match BGP or WHOIS ranges.
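
To make the last point more concrete, here is a minimal Python sketch of how a range of consecutive addresses maps onto CIDR prefixes, and how the token might be read from the configuration. It assumes the input is available as (start IP, end IP, country code) tuples and that config.json has already been parsed into a dict with an `ipinfo` object holding `token`. The helper names `range_to_prefixes` and `get_token` and the sample rows are hypothetical and not taken from the crawler in this PR.

```python
import ipaddress


def range_to_prefixes(start_ip, end_ip):
    """Return the minimal list of CIDR prefixes covering [start_ip, end_ip]."""
    first = ipaddress.ip_address(start_ip)
    last = ipaddress.ip_address(end_ip)
    # summarize_address_range yields the smallest set of networks that
    # exactly covers the range; one range can become several prefixes.
    return [str(net) for net in ipaddress.summarize_address_range(first, last)]


def get_token(config):
    """Read the token from an already-parsed config.json (assumed layout)."""
    return config['ipinfo']['token']


# Illustrative rows only (documentation address space, placeholder
# country codes); this is not real IPinfo data.
rows = [
    ('192.0.2.0', '192.0.2.255', 'AA'),
    ('198.51.100.0', '198.51.100.2', 'BB'),
]
for start, end, country in rows:
    print(country, range_to_prefixes(start, end))
```

A range that is not aligned to a power-of-two boundary expands to more than one prefix, which is one reason such prefixes need not line up with BGP or WHOIS ranges.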

How Has This Been Tested?

Basic queries. More testing to be done.

Screenshots (if appropriate):

MATCH (prefix:Prefix {prefix:'1.0.0.0/24'})-[{reference_name:'ipinfo.ip_country'}]-(cc:Country)
RETURN prefix, cc
[Screenshot: Neo4j Browser result of the query above, 2024-11-05]

Note that one of the two links is from nro.delegated_stats. Not sure why it's showing up when matching on {reference_name:'ipinfo.ip_country'} (not familiar with Cypher yet 😅)?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

maxmouchet force-pushed the feat/add-ipinfo-ip-to-country branch from 7860d6b to 894bdc4 on November 6, 2024 08:55
m-appel self-requested a review on November 6, 2024 09:56

m-appel (Member) commented Nov 6, 2024

Hi, thanks for contributing! I will take a closer look at this soon, but at first glance this already looks pretty good.

> Note that one of the two links is from nro.delegated_stats. Not sure why it's showing up when matching on {reference_name:'ipinfo.ip_country'} (not familiar with Cypher yet 😅)?

This is a side effect of the Neo4j Browser. You can disable it by opening the settings at the bottom left and unchecking "Connect result nodes". Then you'll only get what you asked for.
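
The effect can be illustrated with a small, self-contained Python model. It does not talk to Neo4j; the toy data and the function names `match` and `connect_result_nodes` are made up for illustration. The query itself only matches the ipinfo.ip_country relationship, but with "Connect result nodes" enabled the browser additionally draws every relationship that exists between the returned nodes.

```python
# Toy graph as (start node, end node, reference_name) tuples.
# Illustrative data only, with a placeholder country code.
relationships = [
    ('1.0.0.0/24', 'XX', 'ipinfo.ip_country'),
    ('1.0.0.0/24', 'XX', 'nro.delegated_stats'),
    ('1.0.0.0/24', 'other-node', 'other.dataset'),
]


def match(rels, reference_name):
    """Relationships that the query pattern itself matches."""
    return [r for r in rels if r[2] == reference_name]


def connect_result_nodes(rels, matched):
    """Every relationship whose two endpoints both appear in the result."""
    nodes = {n for start, end, _ in matched for n in (start, end)}
    return [r for r in rels if r[0] in nodes and r[1] in nodes]


matched = match(relationships, 'ipinfo.ip_country')
shown = connect_result_nodes(relationships, matched)
print(len(matched), 'matched;', len(shown), 'drawn by the browser')
```

In this model the query matches one relationship, while two are drawn, because the nro.delegated_stats link connects the same pair of nodes.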

As Romain mentioned in the discussion, this is blocked by #103 but we will get on that soon™

maxmouchet (Author) commented

Awesome, thanks for the hint! No worries, let me know if you need me to do anything :)
