Releases: piskvorky/smart_open
v6.1.0
6.0.0
6.0.0, 24 April 2022
This release deprecates the old ignore_ext parameter.
Use the compression parameter instead.
fin = smart_open.open("/path/file.gz", ignore_ext=True) # 🚫 No
fin = smart_open.open("/path/file.gz", compression="disable") # Yes
fin = smart_open.open("/path/file.gz", ignore_ext=False) # 🚫 No
fin = smart_open.open("/path/file.gz") # Yes
fin = smart_open.open("/path/file.gz", compression="infer_from_extension") # Yes, if you want to be explicit
fin = smart_open.open("/path/file", compression=".gz") # Yes
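The three forms of the compression argument used above ("disable", "infer_from_extension", and an explicit extension such as ".gz") can be mimicked with the standard library alone. The following is only a sketch of those semantics, not smart_open's implementation: `open_compressed` and `_CODECS` are hypothetical names, and handling only .gz and .bz2 is an assumption made for brevity.

```python
import bz2
import gzip
import os
import tempfile

# Hypothetical codec table; smart_open's real registry may differ.
_CODECS = {".gz": gzip.open, ".bz2": bz2.open}


def open_compressed(path, mode="rb", compression="infer_from_extension"):
    """Open path, picking a codec according to the compression argument."""
    if compression == "disable":
        return open(path, mode)  # raw bytes, extension ignored
    if compression == "infer_from_extension":
        ext = os.path.splitext(path)[1]
        opener = _CODECS.get(ext, open)  # unknown extension: plain open
    elif compression in _CODECS:
        opener = _CODECS[compression]  # explicit codec such as ".gz"
    else:
        raise ValueError(f"unsupported compression: {compression!r}")
    return opener(path, mode)


# Write gzip data under a name that has no .gz extension.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "file")
with gzip.open(path, "wb") as fout:
    fout.write(b"hello")

with open_compressed(path, compression=".gz") as fin:
    decoded = fin.read()  # decompressed payload
with open_compressed(path, compression="disable") as fin:
    raw = fin.read()  # still gzip-compressed bytes
```

Because the file name carries no extension, the default "infer_from_extension" mode would also return the raw bytes here, which is why the explicit ".gz" form exists.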
- Make Python 3.7 the required minimum (PR #688, @mpenkov)
- Drop deprecated ignore_ext parameter (PR #661, @mpenkov)
- Drop support for passing buffers to smart_open.open (PR #660, @mpenkov)
- Support working directly with file descriptors (PR #659, @mpenkov)
- Added support for viewfs:// URLs (PR #665, @ChandanChainani)
- Fix AttributeError when reading passthrough zstandard (PR #658, @mpenkov)
- Make UploadFailedError picklable (PR #689, @birgerbr)
- Support container client and blob client for azure blob storage (PR #652, @cbare)
- Pin google-cloud-storage to >=1.31.1 in extras (PR #687, @PLPeeters)
- Expose certain transport-specific methods e.g. to_boto3 in top layer (PR #664, @mpenkov)
- Use pytest instead of parameterizedtestcase (PR #657, @mpenkov)
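For context on the "picklable" entry above: an exception whose constructor takes extra arguments generally cannot be unpickled unless it tells pickle how to rebuild itself. The sketch below shows one common way to do that with `__reduce__`. `UploadError` and its fields are hypothetical and are not taken from smart_open's source; the approach used in PR #689 may differ.

```python
import pickle


class UploadError(Exception):
    """Hypothetical exception carrying extra state (not smart_open's class)."""

    def __init__(self, message, status_code, text):
        super().__init__(message)
        self.status_code = status_code
        self.text = text

    def __reduce__(self):
        # Tell pickle to rebuild the instance with all three constructor
        # arguments; the default would pass only self.args, i.e. the message.
        return (type(self), (self.args[0], self.status_code, self.text))


err = UploadError("upload failed", 500, "server error")
clone = pickle.loads(pickle.dumps(err))
```

This matters when exceptions cross process boundaries, for example when they are returned from a multiprocessing worker.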
5.2.1, 28 August 2021
5.2.0, 18 August 2021
- Work around changes to urllib.parse.urlsplit (PR #633, @judahrand)
- New blob_properties transport parameter for GCS (PR #632, @FHTheron)
- Don't leak compressed stream (PR #636, @ampanasiuk)
- Change python_requires version to fix PEP 440 issue (PR #639, @lucasvieirasilva)
- New max_concurrency transport parameter for azure (PR #642, @omBratteng)
5.1.0, 25 May 2021
This release introduces a new top-level parameter: compression.
It controls compression behavior and partially overlaps with the old ignore_ext parameter.
For details, see the README.rst file.
You may continue to use the ignore_ext parameter for now, but it will be deprecated in the next major release.
- Add warning for recently deprecated s3 parameters (PR #618, @mpenkov)
- Add new top-level compression parameter (PR #609, @dmcguire81)
- Drop mock dependency; standardize on unittest.mock (PR #621, @musicinmybrain)
- Fix to_boto3 method (PR #619, @mpenkov)
5.0.0, 30 Mar 2021
This release modifies the handling of transport parameters for the S3 back-end in a backwards-incompatible way.
See the migration docs for details.
- Refactor S3, replace high-level resource/session API with low-level client API (PR #583, @mpenkov)
- Fix potential infinite loop when reading from webhdfs (PR #597, @traboukos)
- Add timeout parameter for http/https (PR #594, @dustymugs)
- Remove tests directory from package (PR #589, @e-nalepa)
4.2.0, 15 Feb 2021
- Support tell() for text mode write on s3/gcs/azure (PR #582, @markopy)
- Implement option to use a custom buffer during S3 writes (PR #547, @mpenkov)
4.1.2, 18 Jan 2021
- Correctly pass boto3 resource to writers (PR #576, @jackluo923)
- Improve robustness of S3 reading (PR #552, @mpenkov)
- Replace codecs with TextIOWrapper to fix newline issues when reading text files (PR #578, @markopy)
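The changelog does not say which newline problem the last entry fixed. As one illustration of how the two standard-library readers differ, the sketch below reads the same bytes through `codecs.getreader` and through `io.TextIOWrapper`. The sample data is made up, and this is not a reproduction of the original bug.

```python
import codecs
import io

data = b"first\r\nsecond\r\n"

# The codecs stream reader leaves Windows line endings in place.
legacy = codecs.getreader("utf-8")(io.BytesIO(data)).readlines()

# TextIOWrapper applies universal-newline translation by default
# (newline=None), so "\r\n" becomes "\n".
wrapped = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8").readlines()
```

The second form matches what the built-in open function does in text mode.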
4.1.0, 30 Dec 2020
- Refactor s3 submodule to minimize resource usage (PR #569, @mpenkov)
- Change download_as_string to download_as_bytes in gcs submodule (PR #571, @alexandreyc)
4.0.1, 27 Nov 2020
- Exclude requests from install_requires dependency list. If you need it, use pip install smart_open[http] or pip install smart_open[webhdfs].
4.0.0, 24 Nov 2020
- Fix reading empty file or seeking past end of file for s3 backend (PR #549, @jcushman)
- Fix handling of rt/wt mode when working with gzip compression (PR #559, @mpenkov)
- Bump minimum Python version to 3.6 (PR #562, @mpenkov)
3.0.0, 8 Oct 2020
This release modifies the behavior of setup.py with respect to dependencies.
Previously, boto3 and other AWS-related packages were installed by default.
Now, in order to install them, you need to run either:
pip install smart_open[s3]
to install the AWS dependencies only, or
pip install smart_open[all]
to install all dependencies, including AWS, GCS, etc.
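Since the AWS packages are no longer installed by default, code that depends on them may want to check for them up front. The sketch below is one way to do that with the standard library. The `install_hint` helper and the `EXTRA_MODULES` mapping are hypothetical; only the pairing of the s3 extra with boto3 comes from the text above.

```python
import importlib.util

# Assumed mapping from pip extra to the module it provides. Only the s3
# entry is backed by the release notes; the mapping itself is illustrative.
EXTRA_MODULES = {"s3": "boto3"}


def install_hint(extra):
    """Return None if the extra's module is importable, else the pip command."""
    module = EXTRA_MODULES[extra]
    if importlib.util.find_spec(module) is not None:
        return None
    return f"pip install smart_open[{extra}]"


hint = install_hint("s3")  # None when boto3 is already installed
```

A caller could print the hint or raise an ImportError with it before attempting to open an s3:// URL.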
2.2.1, 1 Oct 2020
- Include S3 dependencies by default, because removing them in the 2.2.0 minor release was a mistake.
2.2.0, 25 Sep 2020
This release modifies the behavior of setup.py with respect to dependencies.
Previously, boto3 and other AWS-related packages were installed by default.
Now, in order to install them, you need to run either:
pip install smart_open[s3]
to install the AWS dependencies only, or
pip install smart_open[all]
to install all dependencies, including AWS, GCS, etc.
Summary of changes:
- Correctly pass newline parameter to built-in open function (PR #478, @burkovae)
- Remove boto as a dependency (PR #523, @isobit)
- Performance improvement: avoid redundant GetObject API queries in s3.Reader (PR #495, @jcushman)
- Support installing smart_open without AWS dependencies (PR #534, @justindujardin)
- Take object version into account in to_boto3 method (PR #539, @interpolatio)
Deprecations
Functionality on the left hand side will be removed in future releases.
Use the functions on the right hand side instead.
smart_open.s3_iter_bucket → smart_open.s3.iter_bucket
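A rename like this is commonly handled by keeping the old name as a thin wrapper that warns and then delegates. The sketch below shows that general pattern only. Both function bodies are placeholders, and nothing here is taken from smart_open's source.

```python
import warnings


def iter_bucket(bucket_name):
    """Stand-in for the function at its new location; the body is a placeholder."""
    return [f"{bucket_name}/key"]


def s3_iter_bucket(*args, **kwargs):
    """Old name kept as a wrapper that warns and delegates to the new one."""
    warnings.warn(
        "s3_iter_bucket is deprecated, use s3.iter_bucket instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return iter_bucket(*args, **kwargs)


# Record warnings so the deprecation can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = s3_iter_bucket("my-bucket")
```

Callers migrating ahead of removal only need to switch to the right-hand name.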
2.1.1, 27 Aug 2020
- Bypass unnecessary GCS storage.buckets.get permission (PR #516, @gelioz)
- Allow SFTP connection with SSH key (PR #522, @rostskadat)
2.1.0, 1 July 2020
- Azure storage blob support (@nclsmitchell and @petedannemann)
- Correctly pass newline parameter to built-in open function (PR #478, @burkovae)
- Ensure GCS objects always have a .name att...
v5.2.1
5.2.1, 28 August 2021
2.1.0, 1 July 2020
- Azure storage blob support (@nclsmitchell and @petedannemann)
- Correctly pass newline parameter to built-in open function (PR #478, @burkovae)
- Ensure GCS objects always have a .name attribute (PR #506, @todor-markov)
- Use exception chaining to convey the original cause of the exception (PR #508, @cool-RR)
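Exception chaining, mentioned in the last entry, is the `raise ... from ...` form, which stores the original exception on `__cause__`. A minimal sketch follows. `TransportError`, `read_remote` and `failing_fetch` are hypothetical names used only for illustration.

```python
class TransportError(Exception):
    """Hypothetical high-level error raised to the caller."""


def read_remote(fetch):
    """Call fetch() and re-raise low-level failures with their cause attached."""
    try:
        return fetch()
    except OSError as err:
        # "from err" keeps the original exception on __cause__ and in the
        # printed traceback.
        raise TransportError("unable to read remote object") from err


def failing_fetch():
    raise ConnectionResetError("connection reset by peer")


try:
    read_remote(failing_fetch)
except TransportError as exc:
    cause = exc.__cause__  # the original ConnectionResetError
```

The traceback then shows both errors, joined by "The above exception was the direct cause of the following exception".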
2.0.0, 27 April 2020, "Python 3"
- This version supports Python 3 only (3.5+).
- If you still need Python 2, install the smart_open==1.10.1 legacy release instead.
- Prevent smart_open from writing to logs on import (PR #476, @mpenkov)
- Modify setup.py to explicitly support only Py3.5 and above (PR #471, @Amertz08)
- Include all the test_data in setup.py (PR #473, @sikuan)
1.10.1, 26 April 2020
- This is the last version to support Python 2.7. Versions 1.11 and above will support Python 3 only.
- Use only if you need Python 2.
1.11.1, 8 Apr 2020
- Add missing boto dependency (Issue #468)
1.11.0, 8 Apr 2020
- Fix GCS multiple writes (PR #421, @petedannemann)
- Implemented efficient readline for ByteBuffer (PR #426, @mpenkov)
- Fix WebHDFS read method (PR #433, @mpenkov)
- Make S3 uploads more robust (PR #434, @mpenkov)
- Add pathlib monkeypatch with replacement of pathlib.Path.open (PR #436, @menshikh-iv)
- Fix error when calling str() or repr() on GCS SeekableBufferedInputBase (PR #442, @robcowie)
- Move optional dependencies to extras (PR [#454](https://githu...
v5.2.0
5.2.0, 18 August 2021
- Work around changes to urllib.parse.urlsplit (PR #633, @judahrand)
- New blob_properties transport parameter for GCS (PR #632, @FHTheron)
- Don't leak compressed stream (PR #636, @ampanasiuk)
- Change python_requires version to fix PEP 440 issue (PR #639, @lucasvieirasilva)
- New max_concurrency transport parameter for azure (PR #642, @omBratteng)
v5.1.0
5.1.0, 25 May 2021
This release introduces a new top-level parameter: compression.
It controls compression behavior and partially overlaps with the old ignore_ext parameter.
For details, see the README.rst file.
You may continue to use the ignore_ext parameter for now, but it will be deprecated in the next major release.
- Add warning for recently deprecated s3 parameters (PR #618, @mpenkov)
- Add new top-level compression parameter (PR #609, @dmcguire81)
- Drop mock dependency; standardize on unittest.mock (PR #621, @musicinmybrain)
- Fix to_boto3 method (PR #619, @mpenkov)
1.11.0, 8 Apr 2020
- Fix GCS multiple writes (PR #421, @petedannemann)
- Implemented efficient readline for ByteBuffer (PR #426, @mpenkov)
- Fix WebHDFS read method (PR #433, @mpenkov)
- Make S3 uploads more robust (PR #434, @mpenkov)
- Add pathlib monkeypatch with replacement of pathlib.Path.open (PR #436, @menshikh-iv)
- Fix error when calling str() or repr() on GCS SeekableBufferedInputBase (PR #442, @robcowie)
- Move optional dependencies to extras (PR #454, @Amertz08)
- Correctly handle GCS paths that contain '?' char (PR #460, @chakruperitus)
- Make our doctools submodule more robust (PR #467, @mpenkov)
Starting with this release, you will have to run:
pip install smart_open[gcs] to use the GCS transport.
In the future, all extra dependencies will be optional. If you want to continue installing all of them, use:
pip install smart_open[all]
See the README.rst for details.
1.10.0, 16 Mar 2020
5.0.0
5.0.0, 30 Mar 2021
This release modifies the handling of transport parameters for the S3 back-end in a backwards-incompatible way.
See the migration docs for details.
- Refactor S3, replace high-level resource/session API with low-level client API (PR #583, @mpenkov)
- Fix potential infinite loop when reading from webhdfs (PR #597, @traboukos)
- Add timeout parameter for http/https (PR #594, @dustymugs)
- Remove tests directory from package (PR #589, @e-nalepa)
4.2.0
Unreleased
4.2.0, 15 Feb 2021
- Support tell() for text mode write on s3/gcs/azure (PR #582, @markopy)
- Implement option to use a custom buffer during S3 writes (PR #547, @mpenkov)
1.10.0, 16 Mar 2020
- Various webhdfs improvements (PR #383, @mrk-its)
- Fixes "the connection was closed by the remote peer" error (PR #389, @Gapex)
- allow use of S3 single part uploads (PR #400, @adrpar)
- Add test data in package via MANIFEST.in (PR #401, @jayvdb)
- Google Cloud Storage (GCS) (PR #404, @petedannemann)
- Implement to_boto3 function for S3 I/O. (PR #405, @mpenkov)
- enable smart_open to operate without docstrings (PR #406, @mpenkov)
- Implement object_kwargs parameter (PR #411, @mpenkov)
- Remove dependency on old boto library (PR #413, @mpenkov)
- implemented efficient readline for ByteBuffer (PR #426, @mpenkov)
- improve buffering efficiency (PR #427, @mpenkov)
- fix WebHDFS read method (PR #433, @mpenkov)
- Make S3 uploads more robust (PR #434, @mpenkov)
1.9.0, 3 Nov 2019
- Add version_id transport parameter for fetching a specific S3 object version (PR [#325](https://github.com/RaRe-Technolog...
v4.1.2
Unreleased
4.1.2, 18 Jan 2021
- Correctly pass boto3 resource to writers (PR #576, @jackluo923)
- Improve robustness of S3 reading (PR #552, @mpenkov)
- Replace codecs with TextIOWrapper to fix newline issues when reading text files (PR #578, @markopy)
4.1.0
4.1.0, 30 Dec 2020
- Refactor s3 submodule to minimize resource usage (PR #569, @mpenkov)
- Change download_as_string to download_as_bytes in gcs submodule (PR #571, @alexandreyc)
4.0.1, 27 Nov 2020
- Exclude requests from install_requires dependency list.
  If you need it, use pip install smart_open[http] or pip install smart_open[webhdfs].
4.0.0, 24 Nov 2020
- Fix reading empty file or seeking past end of file for s3 backend (PR #549, @jcushman)
- Fix handling of rt/wt mode when working with gzip compression (PR #559, @mpenkov)
- Bump minimum Python version to 3.6 (PR #562, @mpenkov)
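For readers unfamiliar with what rt/wt mode means over gzip, here is a minimal sketch. It uses only the standard library gzip module, not smart_open itself, and the helper name roundtrip_text is hypothetical. It assumes a writable temporary directory.

```python
import gzip
import os
import tempfile


def roundtrip_text(lines):
    """Write str lines to a gzip file in 'wt' mode, then read them back in 'rt' mode."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)  # only the path is needed; gzip.open reopens the file
    try:
        # 'wt' encodes str to bytes before compressing
        with gzip.open(path, "wt", encoding="utf-8") as fout:
            fout.writelines(lines)
        # 'rt' decompresses, then decodes bytes back to str
        with gzip.open(path, "rt", encoding="utf-8") as fin:
            return fin.readlines()
    finally:
        os.remove(path)
```

In text mode the caller works with str objects on both sides, while the file on disk holds compressed bytes.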
3.0.0, 8 Oct 2020
This release modifies the behavior of setup.py with respect to dependencies.
Previously, boto3 and other AWS-related packages were installed by default.
Now, in order to install them, you need to run either:
pip install smart_open[s3]
to install the AWS dependencies only, or
pip install smart_open[all]
to install all dependencies, including AWS, GCS, etc.
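Since the AWS packages are no longer installed by default, a script may want to check for them before attempting S3 access. The following is a rough sketch using only the standard library; the helper name missing_modules is hypothetical and not part of smart_open.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [
        name
        for name in module_names
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(["boto3"])
    if missing:
        print("Missing optional packages:", ", ".join(missing))
        print("Consider: pip install smart_open[s3]")
```

The check only locates the module and does not import it, so it has no side effects.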
2.2.1, 1 Oct 2020
- Include S3 dependencies by default, because removing them in the 2.2.0 minor release was a mistake.
2.2.0, 25 Sep 2020
This release modifies the behavior of setup.py with respect to dependencies.
Previously, boto3 and other AWS-related packages were installed by default.
Now, in order to install them, you need to run either:
pip install smart_open[s3]
to install the AWS dependencies only, or
pip install smart_open[all]
to install all dependencies, including AWS, GCS, etc.
Summary of changes:
- Correctly pass newline parameter to built-in open function (PR #478, @burkovae)
- Remove boto as a dependency (PR #523, @isobit)
- Performance improvement: avoid redundant GetObject API queries in s3.Reader (PR #495, @jcushman)
- Support installing smart_open without AWS dependencies (PR #534, @justindujardin)
- Take object version into account in to_boto3 method (PR #539, @interpolatio)
Deprecations
Functionality on the left hand side will be removed in future releases.
Use the functions on the right hand side instead.
smart_open.s3_iter_bucket → smart_open.s3.iter_bucket
2.1.1, 27 Aug 2020
- Bypass unnecessary GCS storage.buckets.get permission (PR #516, @gelioz)
- Allow SFTP connection with SSH key (PR #522, @rostskadat)
2.1.0, 1 July 2020
- Azure storage blob support (@nclsmitchell and @petedannemann)
- Correctly pass newline parameter to built-in open function (PR #478, @burkovae)
- Ensure GCS objects always have a .name attribute (PR #506, @todor-markov)
- Use exception chaining to convey the original cause of the exception (PR #508, @cool-RR)
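As a generic illustration of exception chaining in Python, consider the sketch below. It is not taken from the smart_open source; TransportError and read_bytes are hypothetical names chosen for the example.

```python
class TransportError(Exception):
    """Hypothetical wrapper exception; not part of smart_open."""


def read_bytes(path):
    """Read a file, re-raising low-level failures with the original cause attached."""
    try:
        with open(path, "rb") as fin:
            return fin.read()
    except OSError as err:
        # "raise ... from err" stores err in __cause__, so the traceback
        # shows both the wrapper and the underlying error
        raise TransportError(f"unable to read {path!r}") from err
```

A caller that catches TransportError can inspect its __cause__ attribute to find the underlying OSError.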
2.0.0, 27 April 2020, "Python 3"
- This version supports Python 3 only (3.5+).
- If you still need Python 2, install the smart_open==1.10.1 legacy release instead.
- Prevent smart_open from writing to logs on import (PR #476, @mpenkov)
- Modify setup.py to explicitly support only Py3.5 and above (PR #471, @Amertz08)
- Include all the test_data in setup.py (PR #473, @sikuan)
1.10.1, 26 April 2020
- This is the last version to support Python 2.7. Versions 1.11 and above will support Python 3 only.
- Use only if you need Python 2.
1.11.1, 8 Apr 2020
- Add missing boto dependency (Issue #468)
1.11.0, 8 Apr 2020
- Fix GCS multiple writes (PR #421, @petedannemann)
- Implemented efficient readline for ByteBuffer (PR #426, @mpenkov)
- Fix WebHDFS read method (PR #433, @mpenkov)
- Make S3 uploads more robust (PR #434, @mpenkov)
- Add pathlib monkeypatch with replacement of pathlib.Path.open (PR #436, @menshikh-iv)
- Fix error when calling str() or repr() on GCS SeekableBufferedInputBase (PR #442, @robcowie)
- Move optional dependencies to extras (PR #454, @Amertz08)
- Correctly handle GCS paths that contain '?' char (PR #460, @chakruperitus)
- Make our doctools submodule more robust (PR #467, @mpenkov)
Starting with this release, you will have to run:
pip install smart_open[gcs] to use the GCS transport.
In the future, all extra dependencies will be optional. If you want to continue installing all of them, use:
pip install smart_open[all]
See the README.rst for details.
1.10.0, 16 Mar 2020
- Various webhdfs improvements (PR #383, @mrk-its)
- Fixes "the connection was closed by the remote peer" error (PR #389, @Gapex)
- Allow use of S3 single part uploads (PR #400, @adrpar)
- Add test data in package via MANIFEST.in (PR #401, @jayvdb)
- Google Cloud Storage (GCS) (PR #404, @petedannemann)
- Implement to_boto3 function for S3 I/O. (PR #405, @mpenkov)
- Enable smart_open to operate without docstrings (PR #406, @mpenkov)
- Implement object_kwargs parameter (PR #411, @mpenkov)
- Remove dependency on old boto library (PR #413, @mpenkov)
- Implemented efficient readline for ByteBuffer (PR #426, @mpenkov)
- Improve buffering efficiency (PR #427, @mpenkov)
- Fix WebHDFS read method (PR #433, @mpenkov)
- Make S3 uploads more robust (PR #434, @mpenkov)
1.9.0, 3 Nov 2019
- Add version_id transport parameter for fetching a specific S3 object version (PR #325, @interpolatio)
- Document passthrough use case (PR #333, @mpenkov)
- Support seeking over HTTP and HTTPS (PR #339, @interpolatio)
- Add support for rt, rt+, wt, wt+, at, at+ methods (PR #342, @interpolatio)
- Change VERSION to version.py (PR #349, @mpenkov)
- Adding howto guides (PR #355, @mpenkov)
- smart_open/s3: Initial implementations of str and repr (PR [...