Merge pull request #27 from Sgiath/add-7-dot-domains
Add support for domains with 7 dots
Zensavona authored Jan 29, 2024
2 parents 309e8c3 + 7fb1502 commit 16f16a8
Showing 14 changed files with 237 additions and 249 deletions.
3 changes: 3 additions & 0 deletions .formatter.exs
@@ -0,0 +1,3 @@
[
inputs: ["*.exs", "{config,lib,test}/**/*.{ex,exs}"]
]
1 change: 0 additions & 1 deletion .gitignore
@@ -8,7 +8,6 @@
/deps

# Where 3rd-party dependencies like ExDoc output generated docs.
/docs
/doc

# Ignore .fetch files in case you like to edit your project deps locally.
4 changes: 2 additions & 2 deletions .tool-versions
@@ -1,2 +1,2 @@
erlang 26.1.1
elixir 1.15.6-otp-26
erlang 26.2.1
elixir 1.16.0-otp-26
10 changes: 0 additions & 10 deletions .travis.yml

This file was deleted.

67 changes: 67 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,67 @@
# Changelog

## 3.0.4

- Fix issue with new, even longer domains from public_suffix_list.dat @Sgiath

## 3.0.3

- Fix issue with new, longer domains from public_suffix_list.dat @fabiokr

## 3.0.1

- Resolve warnings about SSL and `Mix.Config` being deprecated.

## 3.0.0

- Breaking change: default to including private domains. `:include_private == false` is still
respected (but defaults to false), and a new env var `:icann_only` is added and defaults to
false.

## 2.4.0

- Support disabling compile time http request with `:fetch_latest` config (thanks @s3cur3 for
the PR!)

## 2.3.0

- Bump deps

## 2.2.0

- Use `Logger` for logging

## 2.1.4

- Pin a version of `nimble_parsec` to fix a compilation error on `makeup` (`makeup` has fixed
this downstream, so when `ex_doc` updates `makeup`, this will no longer be required)

## 2.1.3

- Merge a couple of minor PRs

## 2.1.2

- Improve tests and docs slightly

## 2.1.1

- Privatize `Domainatrex.match/n` and `Domainatrex.format_response/2` as they are only ever
intended for internal use

## 2.1.0

- Better handle private domains. Private domains like `*.s3.amazonaws.com` are technically
classed as TLDs (to my understanding?), it doesn't make a lot of sense to parse them this way.
- Fetch a new copy of the public suffix list from The Internet on compile, falling back to a
(now updated!) local copy.

## 2.0.0

- Change the API from returning explicit results to {:ok, result} or {:error, result}. This is to
be more uniform with other libraries I use and for better `with` usage. Sorry if this fucks up
your day.

## 1.0.1

- Fully update the tests to reflect changes in `2.0.0` (thanks for the PR @pbonney!)
84 changes: 23 additions & 61 deletions README.md
@@ -1,30 +1,29 @@
# Domainatrex

> Domainatrex is a TLD parsing library for Elixir, using the Public Suffix list
### Domainatrex is a TLD parsing library for Elixir, using the Public Suffix list.

[![Build Status](https://travis-ci.org/Zensavona/domainatrex.svg?branch=master)](https://travis-ci.org/Zensavona/domainatrex) [![Inline docs](http://inch-ci.org/github/zensavona/domainatrex.svg)](http://inch-ci.org/github/zensavona/domainatrex) [![Coverage Status](https://coveralls.io/repos/Zensavona/domainatrex/badge.svg?branch=master&service=github)](https://coveralls.io/github/Zensavona/domainatrex?branch=master) [![Deps Status](https://beta.hexfaktor.org/badge/all/github/Zensavona/domainatrex.svg)](https://beta.hexfaktor.org/github/Zensavona/domainatrex) [![hex.pm version](https://img.shields.io/hexpm/v/domainatrex.svg)](https://hex.pm/packages/domainatrex) [![hex.pm downloads](https://img.shields.io/hexpm/dt/domainatrex.svg)](https://hex.pm/packages/domainatrex) [![License](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT)
[![hex.pm version](https://img.shields.io/hexpm/v/domainatrex.svg)](https://hex.pm/packages/domainatrex) [![hex.pm downloads](https://img.shields.io/hexpm/dt/domainatrex.svg)](https://hex.pm/packages/domainatrex) [![License](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT)

### [Read the docs](https://hexdocs.pm/domainatrex)



## Installation

Add the following to your `mix.exs`

```
```elixir
defp deps do
[{:domainatrex, "~> 3.0.3"}]
[
{:domainatrex, "~> 3.0.4"},
]
```

## Usage

`Domainatrex` should be able to handle all valid hostnames, it uses the [Public Suffix List](https://publicsuffix.org/list/) and is heavily inspired by the fantastic [Domainatrix](https://github.com/pauldix/domainatrix) library for Ruby
`Domainatrex` should be able to handle all valid hostnames, it uses the
[Public Suffix List](https://publicsuffix.org/list/) and is heavily inspired by the fantastic
[Domainatrix](https://github.com/pauldix/domainatrix) library for Ruby

```
```elixir
iex> Domainatrex.parse("someone.com")
{:ok, %{domain: "someone", subdomain: "", tld: "com"}}

@@ -34,21 +33,21 @@ iex> Domainatrex.parse("blog.someone.id.au")

## Configuration

For maximum performance, `Domainatrex` reads the list of all known top-level domains at compile time.
Likewise, by default, the package will attempt to fetch the latest list of TLDs from the web before
falling back to a local (potentially out of date) copy. You can configure this behavior in your
`config.exs` as follows:
For maximum performance, `Domainatrex` reads the list of all known top-level domains at compile
time. Likewise, by default, the package will attempt to fetch the latest list of TLDs from the
web before falling back to a local (potentially out of date) copy. You can configure this behavior
in your `config.exs` as follows:

- `:fetch_latest`: A Boolean flag to determine whether `Domainatrex` should try to fetch the latest
list of public suffixes at compile time; default is `true`
- `:public_suffix_list_url`: A charlist URL to the latest public suffix file that `Domainatrex` will
try to fetch at compile time; default is
`'https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat'`
- `:fallback_local_copy`: The path to the local suffix file that `Domainatrex` will use if it wasn't
able to fetch a fresh file from the URL, or if fetching updated files was disabled; default is
the `"lib/public_suffix_list.dat"` file included in the package.
- `:fetch_latest`: A Boolean flag to determine whether `Domainatrex` should try to fetch the
latest list of public suffixes at compile time; default is `true`
- `:public_suffix_list_url`: A charlist URL to the latest public suffix file that `Domainatrex`
will try to fetch at compile time; default is
`'https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat'`
- `:fallback_local_copy`: The path to the local suffix file that `Domainatrex` will use if it
wasn't able to fetch a fresh file from the URL, or if fetching updated files was disabled;
default is the `"lib/public_suffix_list.dat"` file included in the package.

Here's a complete example of how you might customize this behavior in your `confix.exs`:
Here's a complete example of how you might customize this behavior in your `config.exs`:

```elixir
config :domainatrex,
@@ -58,40 +57,3 @@ config :domainatrex,
public_suffix_list_url: 'https://publicsuffix.org/list/public_suffix_list.dat',
fallback_local_copy: "priv/my_app_custom_suffix_list.dat"
```

## Changelog

### 3.0.3
- Fix issue with new, longer domains from public_suffix_list.dat
### 3.0.1
- Resolve warnings about SSL and `Mix.Config` being deprecated.
### 3.0.0
- Breaking change: default to including private domains. `:include_private == false` is still respected (but defaults to false), and a new env var `:icann_only` is added and defaults to false.
### 2.4.0
- Support disabling compile time http request with `:fetch_latest` config (thanks @s3cur3 for the PR!)
### 2.3.0
- Bump deps
### 2.2.0
- Use `Logger` for logging

### 2.1.4
- Pin a version of `nimble_parsec` to fix a compilation error on `makeup` (`makeup` has fixed this downstream, so when `ex_doc` updates `makeup`, this will no longer be required)

### 2.1.3
- Merge a couple of minor PRs

### 2.1.2
- Improve tests and docs slightly

### 2.1.1
- Privatise `Domainatrex.match/n` and `Domainatrex.format_response/2` as they are only ever intended for internal use

### 2.1.0
- Better handle private domains. Private domains like `*.s3.amazonaws.com` are technically classed as TLDs (to my understanding?), it doesn't make a lot of sense to parse them this way.
- Fetch a new copy of the public suffix list from The Internet on compile, falling back to a (now updated!) local copy.

### 2.0.0
- Change the API from returning explicit results to {:ok, result} or {:error, result}. This is to be more uniform with other libraries I use and for better `with` usage. Sorry if this fucks up your day.

### 1.0.1
- Fully update the tests to reflect changes in `2.0.0` (thanks for the PR @pbonney!)
31 changes: 3 additions & 28 deletions config/config.exs
@@ -1,30 +1,5 @@
# This file is responsible for configuring your application
# and its dependencies with the aid of the Mix.Config module.
import Config

# This configuration is loaded before any dependency and is restricted
# to this project. If another project depends on this project, this
# file won't be loaded nor affect the parent project. For this reason,
# if you want to provide default values for your application for
# 3rd-party users, it should be done in your "mix.exs" file.

# You can configure for your application as:
#
# config :domainatrex, key: :value
#
# And access this configuration in your application as:
#
# Application.get_env(:domainatrex, :key)
#
# Or configure a 3rd-party app:
#
# config :logger, level: :info
#

# It is also possible to import configuration files, relative to this
# directory. For example, you can emulate configuration per environment
# by uncommenting the line below and defining dev.exs, test.exs and such.
# Configuration from the imported file will override the ones defined
# here (which is why it is important to import them last).
#
import_config "#{Mix.env}.exs"
if config_env() == :test do
config :domainatrex, custom_suffixes: ["localhost"]
end
1 change: 0 additions & 1 deletion config/dev.exs

This file was deleted.

1 change: 0 additions & 1 deletion config/prod.exs

This file was deleted.

4 changes: 0 additions & 4 deletions config/test.exs

This file was deleted.

92 changes: 66 additions & 26 deletions lib/domainatrex.ex
@@ -1,9 +1,21 @@
defmodule Domainatrex do
require Logger

@moduledoc """
Documentation for Domainatrex.
Documentation for Domainatrex
## Examples
iex> Domainatrex.parse("someone.com")
{:ok, %{domain: "someone", subdomain: "", tld: "com"}}
iex> Domainatrex.parse("blog.someone.id.au")
{:ok, %{domain: "someone", subdomain: "blog", tld: "id.au"}}
iex> Domainatrex.parse("zen.s3.amazonaws.com")
{:ok, %{domain: "s3", subdomain: "zen", tld: "amazonaws.com"}}
"""

require Logger

@fallback_local_copy Application.compile_env(
:domainatrex,
:fallback_local_copy,
@@ -20,7 +32,7 @@ defmodule Domainatrex do
Application.compile_env(
:domainatrex,
:public_suffix_list_url,
'https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat'
~c"https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat"
),
{:ok, {_, _, string}} <- :httpc.request(:get, {public_suffix_list_url, []}, [], []) do
@public_suffix_list to_string(string)
@@ -135,6 +147,19 @@ defmodule Domainatrex do
defp match([unquote(Enum.at(suffix, 0)), unquote(Enum.at(suffix, 1)) | _] = args) do
format_response(Enum.slice(args, 0, 6), Enum.slice(args, 6, 10))
end

7 ->
defp match([unquote(Enum.at(suffix, 0)), unquote(Enum.at(suffix, 1)), a]) do
format_response([unquote(Enum.at(suffix, 0)), unquote(Enum.at(suffix, 1))], [a])
end

defp match([unquote(Enum.at(suffix, 0)), unquote(Enum.at(suffix, 1)), a, b]) do
format_response([unquote(Enum.at(suffix, 0)), unquote(Enum.at(suffix, 1))], [a, b])
end

defp match([unquote(Enum.at(suffix, 0)), unquote(Enum.at(suffix, 1)) | _] = args) do
format_response(Enum.slice(args, 0, 7), Enum.slice(args, 7, 10))
end
end
else
case length(suffix) do
@@ -198,15 +223,15 @@ defmodule Domainatrex do

6 ->
defp match(
[
unquote(Enum.at(suffix, 0)),
unquote(Enum.at(suffix, 1)),
unquote(Enum.at(suffix, 2)),
unquote(Enum.at(suffix, 3)),
unquote(Enum.at(suffix, 4)),
unquote(Enum.at(suffix, 5)) | tail
] = args
) do
[
unquote(Enum.at(suffix, 0)),
unquote(Enum.at(suffix, 1)),
unquote(Enum.at(suffix, 2)),
unquote(Enum.at(suffix, 3)),
unquote(Enum.at(suffix, 4)),
unquote(Enum.at(suffix, 5)) | tail
] = args
) do
format_response(
[
Enum.at(args, 0),
@@ -220,8 +245,34 @@ defmodule Domainatrex do
)
end

7 ->
defp match(
[
unquote(Enum.at(suffix, 0)),
unquote(Enum.at(suffix, 1)),
unquote(Enum.at(suffix, 2)),
unquote(Enum.at(suffix, 3)),
unquote(Enum.at(suffix, 4)),
unquote(Enum.at(suffix, 5)),
unquote(Enum.at(suffix, 6)) | tail
] = args
) do
format_response(
[
Enum.at(args, 0),
Enum.at(args, 1),
Enum.at(args, 2),
Enum.at(args, 3),
Enum.at(args, 4),
Enum.at(args, 5),
Enum.at(args, 6)
],
tail
)
end

_ ->
{:error, "There exists a domain in the list which contains more than 5 dots: #{suffix}"}
{:error, "There exists a domain in the list which contains more than 7 dots: #{suffix}"}
end
end
end)
@@ -236,18 +287,7 @@ defmodule Domainatrex do
end
end

@doc """
## Examples
iex> Domainatrex.parse("someone.com")
{:ok, %{domain: "someone", subdomain: "", tld: "com"}}
iex> Domainatrex.parse("blog.someone.id.au")
{:ok, %{domain: "someone", subdomain: "blog", tld: "id.au"}}
iex> Domainatrex.parse("zen.s3.amazonaws.com")
{:ok, %{domain: "s3", subdomain: "zen", tld: "amazonaws.com"}}
"""
def parse(url) do
def parse(url) when is_binary(url) do
case String.length(url) > 1 && String.contains?(url, ".") do
true ->
adjusted_url = url |> String.split(".") |> Enum.reverse()
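
Note on the approach: the diff adds one more `match/1` clause per suffix length, so each longer entry in `public_suffix_list.dat` needs another `case` branch. For comparison, here is a minimal sketch of the same reversed-label matching done at runtime for suffixes of any length. It is not the library's implementation: the module name `SuffixSketch` and the four hard-coded suffixes are hypothetical, and wildcard and exception rules from the Public Suffix List are ignored.

```elixir
defmodule SuffixSketch do
  # Hypothetical, hard-coded suffixes stored as reversed label lists,
  # e.g. "id.au" becomes ["au", "id"]. Domainatrex itself generates
  # match/1 clauses from public_suffix_list.dat at compile time.
  @suffixes [
    ["au", "id"],
    ["com", "amazonaws"],
    ["com"],
    ["au"]
  ]

  def parse(host) when is_binary(host) do
    # Same first step as Domainatrex.parse/1: split on dots and reverse.
    labels = host |> String.split(".") |> Enum.reverse()

    # Keep the suffixes that prefix the reversed labels, then take the longest.
    @suffixes
    |> Enum.filter(&List.starts_with?(labels, &1))
    |> Enum.max_by(&length/1, fn -> nil end)
    |> case do
      nil ->
        {:error, "no matching suffix"}

      suffix ->
        # Whatever remains after the suffix is the domain plus any subdomains.
        case Enum.drop(labels, length(suffix)) do
          [] ->
            {:error, "no domain label before the suffix"}

          [domain | sub] ->
            {:ok,
             %{
               domain: domain,
               subdomain: sub |> Enum.reverse() |> Enum.join("."),
               tld: suffix |> Enum.reverse() |> Enum.join(".")
             }}
        end
    end
  end
end
```

With these assumed suffixes, `SuffixSketch.parse("blog.someone.id.au")` returns `{:ok, %{domain: "someone", subdomain: "blog", tld: "id.au"}}`, the same shape as the doctests in the moduledoc. The trade-off is a linear scan per call instead of compile-time pattern matching, which is presumably why the library generates clauses instead.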