When I started learning Python, there was one thing I always had trouble with: dealing with directories and file paths!
I remember the struggle to manipulate paths as strings using the os module. I was constantly looking up error messages related to improper path manipulation.
The os module never felt intuitive or ergonomic to me, but my luck changed when pathlib landed in Python 3.4. It was a breath of fresh air, much easier to use, and felt more Pythonic to me.
The only problem was that finding examples on how to use it was hard; the documentation only covered a few use cases. And yes, Python’s docs are good, but for newcomers, examples are a must.
Even though the docs are much better now, they don’t showcase the module in a problem-solving fashion. That’s why I decided to create this cookbook.
This article is a brain dump of everything I know about pathlib. It’s meant to be a reference rather than a linear guide. Feel free to jump around to the sections that are most relevant to you.
In this guide, we’ll go over dozens of use cases such as:
- how to create (touch) an empty file
- how to convert a path to string
- getting the home directory
- creating new directories, doing it recursively, and dealing with issues when they already exist
- getting the current working directory
- get the file extension from a filename
- get the parent directory of a file or script
- read and write text or binary files
- how to delete files
- how to create nested directories
- how to list all files and folders in a directory
- how to list all subdirectories recursively
- how to remove a directory along with its contents
I hope you enjoy!
Table of contents
- What is pathlib in Python?
- The anatomy of a pathlib.Path
- How to convert a path to string
- How to join a path by adding parts or other paths
- Working with directories using pathlib
  - How to get the current working directory (cwd) with pathlib
  - How to get the home directory with pathlib
  - How to expand the initial path component with Path.expanduser()
  - How to list all files and directories
  - Using is_dir to list only the directories
  - Getting a list of all subdirectories in the current directory recursively
  - How to recursively iterate through all files
  - How to change directories with Python pathlib
  - How to delete directories with pathlib
  - How to remove a directory along with its contents with pathlib
- Working with files using pathlib
  - How to touch a file and create parent directories
  - How to get the filename from path
  - How to get the file extension from a filename using pathlib
  - How to open a file for reading with pathlib
  - How to read text files with pathlib
  - How to read JSON files from path with pathlib
  - How to write a text file with pathlib
  - How to copy files with pathlib
  - How to delete a file with pathlib
  - How to delete all files in a directory with pathlib
  - How to rename a file using pathlib
  - How to get the parent directory of a file with pathlib
- Conclusion
What is pathlib in Python?
pathlib is a Python module created to make it easier to work with paths in a file system. This module debuted in Python 3.4 and was proposed in PEP 428.
Prior to Python 3.4, the os module from the standard library was the go-to module for handling paths. os provides several functions that manipulate paths represented as plain Python strings. For example, to join two paths using os, one can use the os.path.join function.
>>> import os
>>> os.path.join('/home/user', 'projects')
'/home/user/projects'
>>> os.path.expanduser('~')
'C:\\Users\\Miguel'
>>> home = os.path.expanduser('~')
>>> os.path.join(home, 'projects')
'C:\\Users\\Miguel\\projects'
Representing paths as strings encourages inexperienced Python developers to perform common path operations using string methods. For example, they might join paths with + instead of using os.path.join(), which can lead to subtle bugs and make the code hard to reuse across multiple platforms.
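To make the risk concrete, here is a minimal sketch comparing string concatenation with a proper join. It uses the standard library's posixpath and ntpath modules (the POSIX and Windows flavours behind os.path) so the result is the same no matter where you run it; the variable names are only illustrative.

```python
import ntpath      # the Windows flavour of os.path
import posixpath   # the POSIX flavour of os.path

base = '/home/user'

# Naive concatenation silently drops the separator.
broken = base + 'projects'

# A proper join inserts the separator used by the target platform.
on_posix = posixpath.join(base, 'projects')
on_windows = ntpath.join('C:\\Users\\Miguel', 'projects')

print(broken)      # /home/userprojects
print(on_posix)    # /home/user/projects
print(on_windows)  # C:\Users\Miguel\projects
```

Hard-coding '/' or '\\' in the concatenation fixes the first problem but ties the code to a single platform, which is the portability issue described above.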
Moreover, if you want the path operations to be platform agnostic, you will need multiple calls to various os functions such as os.path.dirname(), os.path.basename(), and others.
In an attempt to fix these issues, Python 3.4 incorporated the pathlib module. It provides a high-level abstraction that works well on POSIX systems, such as Linux, as well as on Windows. It abstracts away the path’s representation and provides the operations as methods.
The anatomy of a pathlib.Path
To make it easier to understand the basic components of a Path, in this section we’ll inspect them one by one, first on a POSIX system and then on Windows.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/blog/config.tar.gz')
>>> path.drive
''
>>> path.root
'/'
>>> path.anchor
'/'
>>> path.parent
PosixPath('/home/miguel/projects/blog')
>>> path.name
'config.tar.gz'
>>> path.stem
'config.tar'
>>> path.suffix
'.gz'
>>> path.suffixes
['.tar', '.gz']
>>> from pathlib import Path
>>> path = Path(r'C:/Users/Miguel/projects/blog/config.tar.gz')
>>> path.drive
'C:'
>>> path.root
'\\'
>>> path.anchor
'C:\\'
>>> path.parent
WindowsPath('C:/Users/Miguel/projects/blog')
>>> path.name
'config.tar.gz'
>>> path.stem
'config.tar'
>>> path.suffix
'.gz'
>>> path.suffixes
['.tar', '.gz']
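If you want to compare both flavours on a single machine, a small sketch like the one below may help. It relies on PurePosixPath and PureWindowsPath, which only manipulate the path text and never touch the file system; the describe helper is a hypothetical name introduced just for this illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def describe(path):
    """Return the main components of a path as a dictionary."""
    return {
        'drive': path.drive,
        'root': path.root,
        'anchor': path.anchor,
        'parent': str(path.parent),
        'name': path.name,
        'stem': path.stem,
        'suffix': path.suffix,
        'suffixes': path.suffixes,
    }


posix_info = describe(PurePosixPath('/home/miguel/projects/blog/config.tar.gz'))
windows_info = describe(PureWindowsPath('C:/Users/Miguel/projects/blog/config.tar.gz'))

for key in posix_info:
    print(f'{key:9} {posix_info[key]!r:40} {windows_info[key]!r}')
```

Only drive, root, anchor, and parent differ between the two flavours; name, stem, suffix, and suffixes are identical.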
How to convert a path to string
pathlib implements the magic __str__ method, and we can use it to convert a path to a string. Having this method implemented means you can get the path’s string representation by passing it to the str constructor, as in the example below.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/tutorial')
>>> str(path)
'/home/miguel/projects/tutorial'
>>> repr(path)
"PosixPath('/home/miguel/projects/tutorial')"
The example above illustrates a PosixPath, but you can also convert a WindowsPath to a string using the same mechanism.
>>> from pathlib import Path
>>> path = Path(r'C:/Users/Miguel/projects/blog/config.tar.gz')
# when we convert a WindowsPath to string, Python adds backslashes
>>> str(path)
'C:\\Users\\Miguel\\projects\\blog\\config.tar.gz'
# whereas repr returns the path with forward slashes as it is represented on Windows
>>> repr(path)
"WindowsPath('C:/Users/Miguel/projects/blog/config.tar.gz')"
How to join a path by adding parts or other paths
One of the things I like the most about pathlib is how easy it is to join two or more paths, or parts. There are three main ways you can do that:
- pass all the individual parts of a path to the constructor
- use the .joinpath method
- use the / operator
>>> from pathlib import Path
# pass all the parts to the constructor
>>> Path('.', 'projects', 'python', 'source')
PosixPath('projects/python/source')
# Using the / operator to join another path object
>>> Path('.', 'projects', 'python') / Path('source')
PosixPath('projects/python/source')
# Using the / operator to join a string
>>> Path('.', 'projects', 'python') / 'source'
PosixPath('projects/python/source')
# Using the joinpath method
>>> Path('.', 'projects', 'python').joinpath('source')
PosixPath('projects/python/source')
On Windows, Path returns a WindowsPath instead, but it works the same way as in Linux.
>>> Path('.', 'projects', 'python', 'source')
WindowsPath('projects/python/source')
>>> Path('.', 'projects', 'python') / Path('source')
WindowsPath('projects/python/source')
>>> Path('.', 'projects', 'python') / 'source'
WindowsPath('projects/python/source')
>>> Path('.', 'projects', 'python').joinpath('source')
WindowsPath('projects/python/source')
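Two details are worth knowing when joining, sketched below with PurePosixPath so the output does not depend on your operating system: joinpath accepts several parts at once, and joining an absolute path discards everything that came before it. The variable names are only illustrative.

```python
from pathlib import PurePosixPath

base = PurePosixPath('/home/miguel')

# joinpath accepts any number of parts in a single call
nested = base.joinpath('projects', 'python', 'source')

# joining an absolute path replaces the base instead of extending it
replaced = base / '/etc/hosts'

print(nested)    # /home/miguel/projects/python/source
print(replaced)  # /etc/hosts
```

The second behaviour mirrors os.path.join and can be surprising when the right-hand side comes from user input.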
Working with directories using pathlib
In this section, we’ll see how we can traverse, or walk, through directories with pathlib. And when it comes to navigating folders, there are many things we can do, such as:
- getting the current working directory
- getting the home directory
- expanding the home directory
- creating new directories, doing it recursively, and dealing with issues when they already exist
- creating nested directories
- listing all files and folders in a directory
- listing only folders in a directory
- listing only the files in a directory
- getting the number of files in a directory
- listing all subdirectories recursively
- listing all files in a directory and subdirectories recursively
- recursively listing all files with a given extension or pattern
- changing current working directories
- removing an empty directory
- removing a directory along with its contents
How to get the current working directory (cwd) with pathlib
The pathlib module provides a classmethod, Path.cwd(), to get the current working directory in Python. It returns a PosixPath instance on Linux, or other Unix systems such as macOS or OpenBSD. Under the hood, Path.cwd() is just a wrapper for the classic os.getcwd().
>>> from pathlib import Path
>>> Path.cwd()
PosixPath('/home/miguel/Desktop/pathlib')
On Windows, it returns a WindowsPath.
>>> from pathlib import Path
>>> Path.cwd()
WindowsPath('C:/Users/Miguel/pathlib')
You can also print it by converting it to a string using an f-string, for example.
>>> from pathlib import Path
>>> print(f'This is the current directory: {Path.cwd()}')
This is the current directory: /home/miguel/Desktop/pathlib
PS: If you
How to get the home directory with pathlib
When pathlib arrived in Python 3.4, a Path had no method for navigating to the home directory. This changed in Python 3.5, with the inclusion of the Path.home() method.
In Python 3.4, one has to use os.path.expanduser, which is awkward and unintuitive.
# In python 3.4
>>> import pathlib, os
>>> pathlib.Path(os.path.expanduser("~"))
PosixPath('/home/miguel')
From Python 3.5 onwards, you just call Path.home().
# In Python 3.5+
>>> import pathlib
>>> pathlib.Path.home()
PosixPath('/home/miguel')
Path.home() also works well on Windows.
>>> import pathlib
>>> pathlib.Path.home()
WindowsPath('C:/Users/Miguel')
How to expand the initial path component with Path.expanduser()
In Unix systems, the home directory can be expanded using ~ (the tilde symbol). For example, this allows us to represent a full path like /home/miguel/Desktop as just ~/Desktop/.
>>> from pathlib import Path
>>> path = Path('~/Desktop/')
>>> path.expanduser()
PosixPath('/home/miguel/Desktop')
Despite being more popular on Unix systems, this representation also works on Windows.
>>> path = Path('~/projects')
>>> path.expanduser()
WindowsPath('C:/Users/Miguel/projects')
>>> path.expanduser().exists()
True
What’s the opposite of os.path.expanduser()?
Unfortunately, the pathlib module doesn’t have any method to do the inverse operation. If you want to condense the expanded path back to its shorter version, you need to get the path relative to your home directory using Path.relative_to, and place the ~ in front of it.
>>> from pathlib import Path
>>> path = Path('~/Desktop/')
>>> expanded_path = path.expanduser()
>>> expanded_path
PosixPath('/home/miguel/Desktop')
>>> '~' / expanded_path.relative_to(Path.home())
PosixPath('~/Desktop')
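One caveat is that Path.relative_to raises ValueError when the path is not located under the home directory. A possible way to wrap the trick above is sketched here; collapse_user is a hypothetical helper name, not something pathlib provides.

```python
from pathlib import Path


def collapse_user(path: Path) -> Path:
    """Replace a leading home directory with '~' (hypothetical helper)."""
    try:
        return Path('~') / path.relative_to(Path.home())
    except ValueError:
        # the path is not under the home directory, so leave it untouched
        return path


inside = collapse_user(Path.home() / 'Desktop')
outside = collapse_user(Path('some/relative/dir'))

print(inside)   # ~/Desktop on Linux and macOS
print(outside)  # returned unchanged
```

Calling expanduser() on the collapsed path gives back the original absolute path.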
Creating directories with pathlib
A directory is nothing more than a location for storing files and other directories, also called folders. pathlib.Path comes with a method to create new directories named Path.mkdir().
This method takes three arguments:
- mode: Used to determine the file mode and access flags.
- parents: Similar to the mkdir -p command in Unix systems. Defaults to False, which means it raises FileNotFoundError if a parent is missing. When it’s True, Path.mkdir creates the missing parent directories.
- exist_ok: Defaults to False and raises FileExistsError if the directory being created already exists. When you set it to True, pathlib ignores the error unless the last part of the path is an existing non-directory file.
>>> from pathlib import Path
# lists all files and directories in the current folder
>>> list(Path.cwd().iterdir())
[PosixPath('/home/miguel/path/not_created_yet'),
PosixPath('/home/miguel/path/reports')]
# create a new path instance
>>> path = Path('new_directory')
# only the path instance has been created, but it doesn't exist on disk yet
>>> path.exists()
False
# create path on disk
>>> path.mkdir()
# now it exists
>>> path.exists()
True
# indeed, it shows up
>>> list(Path.cwd().iterdir())
[PosixPath('/home/miguel/path/not_created_yet'),
PosixPath('/home/miguel/path/reports'),
PosixPath('/home/miguel/path/new_directory')]
Creating a directory that already exists
When you have a directory path and it already exists, Python raises FileExistsError if you call Path.mkdir() on it. In the previous section, we briefly mentioned that this happens because the exist_ok argument is set to False by default.
>>> from pathlib import Path
>>> list(Path.cwd().iterdir())
[PosixPath('/home/miguel/path/not_created_yet'),
PosixPath('/home/miguel/path/reports'),
PosixPath('/home/miguel/path/new_directory')]
>>> path = Path('new_directory')
>>> path.exists()
True
>>> path.mkdir()
---------------------------------------------------------------------------
FileExistsError Traceback (most recent call last)
<ipython-input-25-4b7d1fa6f6eb> in <module>
----> 1 path.mkdir()
~/.pyenv/versions/3.9.4/lib/python3.9/pathlib.py in mkdir(self, mode, parents, exist_ok)
1311 try:
-> 1312 self._accessor.mkdir(self, mode)
1313 except FileNotFoundError:
1314 if not parents or self.parent == self:
FileExistsError: [Errno 17] File exists: 'new_directory'
To call Path.mkdir() on a folder that already exists without getting an error, you need to set exist_ok to True. This is useful if you don’t want to check with if statements or deal with exceptions, for example. Another benefit is that if the directory is not empty, pathlib won’t overwrite it.
>>> path = Path('new_directory')
>>> path.exists()
True
>>> path.mkdir(exist_ok=True)
>>> list(Path.cwd().iterdir())
[PosixPath('/home/miguel/path/not_created_yet'),
PosixPath('/home/miguel/path/reports'),
PosixPath('/home/miguel/path/new_directory')]
>>> (path / 'new_file.txt').touch()
>>> list(path.iterdir())
[PosixPath('new_directory/new_file.txt')]
>>> path.mkdir(exist_ok=True)
# the file is still there, pathlib didn't overwrite it
>>> list(path.iterdir())
[PosixPath('new_directory/new_file.txt')]
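There is one case where exist_ok=True does not silence the error: when the path already exists but is a regular file rather than a directory. The sketch below reproduces that in a temporary directory so it leaves nothing behind; the file name report is made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'report'
    target.touch()  # 'report' now exists as a regular file

    try:
        target.mkdir(exist_ok=True)
        clash = False
    except FileExistsError:
        # exist_ok only tolerates an existing *directory*
        clash = True

print(clash)  # True
```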
How to create parent directories recursively if they don’t exist
Sometimes you might want to create not only a single directory but also a parent and a subdirectory in one go.
The good news is that Path.mkdir() can handle situations like this well thanks to its parents argument. When parents is set to True, Path.mkdir creates the missing parent directories; this behavior is similar to the mkdir -p command in Unix systems.
>>> from pathlib import Path
>>> path = Path('new_parent_dir/sub_dir')
>>> path.mkdir()
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-35-4b7d1fa6f6eb> in <module>
----> 1 path.mkdir()
~/.pyenv/versions/3.9.4/lib/python3.9/pathlib.py in mkdir(self, mode, parents, exist_ok)
1311 try:
-> 1312 self._accessor.mkdir(self, mode)
1313 except FileNotFoundError:
1314 if not parents or self.parent == self:
FileNotFoundError: [Errno 2] No such file or directory: 'new_parent_dir/sub_dir'
>>> path.mkdir(parents=True)
>>> path.exists()
True
>>> path.parent
PosixPath('new_parent_dir')
>>> path
PosixPath('new_parent_dir/sub_dir')
How to list all files and directories
There are many ways you can list files in a directory with Python’s pathlib. We’ll see each one in this section.
To list all files in a directory, including other directories, you can use the Path.iterdir() method. For performance reasons, it returns a generator that you can either iterate over directly or convert to a list for convenience.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> list(path.iterdir())
[PosixPath('/home/miguel/projects/pathlib/script.py'),
PosixPath('/home/miguel/projects/pathlib/README.md'),
PosixPath('/home/miguel/projects/pathlib/tests'),
PosixPath('/home/miguel/projects/pathlib/src')]
Using is_dir to list only the directories
We’ve seen that iterdir yields Path objects. To list only the directories in a folder, you can filter them with the Path.is_dir() method. The example below gets all the folders inside the directory.
⚠️ WARNING: This example only lists the immediate subdirectories in Python. In the next subsection, we’ll see how to list all subdirectories.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> [p for p in path.iterdir() if p.is_dir()]
[PosixPath('/home/miguel/projects/pathlib/tests'),
PosixPath('/home/miguel/projects/pathlib/src')]
Getting a list of all subdirectories in the current directory recursively
In this section, we’ll see how to navigate a directory and its subdirectories. This time we’ll use another method from pathlib.Path named glob.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> [p for p in path.glob('**/*') if p.is_dir()]
[PosixPath('/home/miguel/projects/pathlib/tests'),
PosixPath('/home/miguel/projects/pathlib/src'),
PosixPath('/home/miguel/projects/pathlib/src/dir')]
As you can see, Path.glob also returns the subdirectory src/dir.
Remembering to pass '**/' to glob() is a bit annoying, but there’s a way to simplify this by using Path.rglob().
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> [p for p in path.rglob('*') if p.is_dir()]
[PosixPath('/home/miguel/projects/pathlib/tests'),
PosixPath('/home/miguel/projects/pathlib/src'),
PosixPath('/home/miguel/projects/pathlib/src/dir')]
How to list only the files with is_file
Just as pathlib provides a method to check if a path is a directory, it also provides one to check if a path is a file. This method is called Path.is_file(), and you can use it to filter out the directories and print all file names in a folder.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> [p for p in path.iterdir() if p.is_file()]
[PosixPath('/home/miguel/projects/pathlib/script.py'),
PosixPath('/home/miguel/projects/pathlib/README.md')]
⚠️ WARNING: This example only lists the files inside the current directory. In the next subsection, we’ll see how to list all files inside the subdirectories as well.
Another nice use case is using Path.iterdir() to count the number of files inside a folder.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> len([p for p in path.iterdir() if p.is_file()])
2
How to recursively iterate through all files
In previous sections, we used Path.rglob() to list all directories recursively. We can do the same for files by filtering the paths using the Path.is_file() method.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> [p for p in path.rglob('*') if p.is_file()]
[PosixPath('/home/miguel/projects/pathlib/script.py'),
PosixPath('/home/miguel/projects/pathlib/README.md'),
PosixPath('/home/miguel/projects/pathlib/tests/test_script.py'),
PosixPath('/home/miguel/projects/pathlib/src/dir/walk.py')]
How to recursively list all files with a given extension or pattern
In the previous example, we listed all files in a directory, but what if we want to filter by extension? For that, pathlib.Path has a method named match(), which returns True if matching is successful, and False otherwise.
In the example below, we list all .py files recursively.
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> [p for p in path.rglob('*') if p.is_file() and p.match('*.py')]
[PosixPath('/home/miguel/projects/pathlib/script.py'),
PosixPath('/home/miguel/projects/pathlib/tests/test_script.py'),
PosixPath('/home/miguel/projects/pathlib/src/dir/walk.py')]
We can use the same trick for other kinds of files. For example, we might want to list all images in a directory or its subdirectories.
>>> from pathlib import Path
>>> path = Path('/home/miguel/pictures')
>>> [p for p in path.rglob('*')
...  if p.match('*.jpeg') or p.match('*.jpg') or p.match('*.png')]
[PosixPath('/home/miguel/pictures/dog.png'),
PosixPath('/home/miguel/pictures/london/sunshine.jpg'),
PosixPath('/home/miguel/pictures/london/building.jpeg')]
We can simplify it even further by using only Path.glob and Path.rglob for the matching. (Thanks to u/laundmo and u/SquareRootsi for pointing this out!)
>>> from pathlib import Path
>>> path = Path('/home/miguel/projects/pathlib')
>>> list(path.rglob('*.py'))
[PosixPath('/home/miguel/projects/pathlib/script.py'),
PosixPath('/home/miguel/projects/pathlib/tests/test_script.py'),
PosixPath('/home/miguel/projects/pathlib/src/dir/walk.py')]
>>> list(path.glob('*.py'))
[PosixPath('/home/miguel/projects/pathlib/script.py')]
>>> list(path.glob('**/*.py'))
[PosixPath('/home/miguel/projects/pathlib/script.py'),
PosixPath('/home/miguel/projects/pathlib/tests/test_script.py'),
PosixPath('/home/miguel/projects/pathlib/src/dir/walk.py')]
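A glob pattern handles one extension at a time, so for several extensions an alternative is to compare Path.suffix against a set. The sketch below is one possible take; find_images and IMAGE_SUFFIXES are hypothetical names, and the files are created in a temporary directory purely for demonstration.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {'.jpeg', '.jpg', '.png'}


def find_images(root: Path) -> list:
    """Recursively collect files whose extension is in IMAGE_SUFFIXES."""
    return sorted(
        p for p in root.rglob('*')
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'london').mkdir()
    for name in ('dog.png', 'london/sunshine.JPG', 'notes.txt'):
        (root / name).touch()
    found = [p.relative_to(root).as_posix() for p in find_images(root)]

print(found)  # ['dog.png', 'london/sunshine.JPG']
```

Lower-casing the suffix makes the check case-insensitive, which a plain '*.jpg' pattern is not on case-sensitive file systems.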
How to change directories with Python pathlib
Unfortunately, pathlib has no built-in method to change directories. However, it is possible to combine it with the os.chdir() function, and use it to change the current directory to a different one.
⚠️ WARNING: For versions prior to 3.6, os.chdir only accepts paths as strings.
>>> import os
>>> import pathlib
>>> pathlib.Path.cwd()
PosixPath('/home/miguel')
>>> target_dir = '/home'
>>> os.chdir(target_dir)
>>> pathlib.Path.cwd()
PosixPath('/home')
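Because os.chdir() changes the directory for the whole process, it is easy to forget to switch back. One way to guard against that, sketched here under the assumption that you want the change to be temporary, is a small context manager; working_directory is a hypothetical name, not part of pathlib.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_directory(path):
    """Temporarily change the current working directory (hypothetical helper)."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield Path.cwd()
    finally:
        # always go back, even if the body raised an exception
        os.chdir(previous)


start = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp) as inside:
        changed = inside.resolve() == Path(tmp).resolve()
restored = Path.cwd() == start

print(changed, restored)  # True True
```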
How to delete directories with pathlib
Deleting directories using pathlib depends on whether the folder is empty or not. To delete an empty directory, we can use the Path.rmdir() method.
>>> from pathlib import Path
>>> path = Path('new_empty_dir')
>>> path.mkdir()
>>> path.exists()
True
>>> path.rmdir()
>>> path.exists()
False
If we put a file or another directory inside and try to delete it, Path.rmdir() raises an error.
>>> from pathlib import Path
>>> path = Path('non_empty_dir')
>>> path.mkdir()
>>> (path / 'file.txt').touch()
>>> path
PosixPath('non_empty_dir')
>>> path.exists()
True
>>> path.rmdir()
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-64-00bf20b27a59> in <module>
----> 1 path.rmdir()
~/.pyenv/versions/3.9.4/lib/python3.9/pathlib.py in rmdir(self)
1350 Remove this directory. The directory must be empty.
...
-> 1352 self._accessor.rmdir(self)
1353
1354 def lstat(self):
OSError: [Errno 39] Directory not empty: 'non_empty_dir'
Now, the question is: how to delete non-empty directories with pathlib?
This is what we’ll see next.
How to remove a directory along with its contents with pathlib
To delete a non-empty directory, we need to remove everything it contains first.
To do that with pathlib, we need to create a function that uses Path.iterdir() to walk or traverse the directory and:
- if the path is a file, we call Path.unlink()
- otherwise, we call the function recursively. When there are no more files, that is, when the folder is empty, we just call Path.rmdir()
Let’s use the following example of a non-empty directory with nested folders and files in it.
$ tree /home/miguel/Desktop/blog/pathlib/sandbox/
/home/miguel/Desktop/blog/pathlib/sandbox/
├── article.txt
└── reports
    ├── another_nested
    │   └── some_file.png
    └── article.txt
2 directories, 3 files
To remove it, we can use the following recursive function.
>>> from pathlib import Path
>>> def remove_all(root: Path):
...     for path in root.iterdir():
...         if path.is_file():
...             print(f'Deleting the file: {path}')
...             path.unlink()
...         else:
...             remove_all(path)
...     print(f'Deleting the empty dir: {root}')
...     root.rmdir()
Then, we invoke it on the root directory, which is removed as well.
>>> from pathlib import Path
>>> root = Path('/home/miguel/Desktop/blog/pathlib/sandbox')
>>> root
PosixPath('/home/miguel/Desktop/blog/pathlib/sandbox')
>>> root.exists()
True
>>> remove_all(root)
Deleting the file: /home/miguel/Desktop/blog/pathlib/sandbox/reports/another_nested/some_file.png
Deleting the empty dir: /home/miguel/Desktop/blog/pathlib/sandbox/reports/another_nested
Deleting the file: /home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt
Deleting the empty dir: /home/miguel/Desktop/blog/pathlib/sandbox/reports
Deleting the file: /home/miguel/Desktop/blog/pathlib/sandbox/article.txt
Deleting the empty dir: /home/miguel/Desktop/blog/pathlib/sandbox
>>> root
PosixPath('/home/miguel/Desktop/blog/pathlib/sandbox')
>>> root.exists()
False
I need to be honest: this solution works fine, but it’s not the most appropriate one. pathlib is not well suited to this kind of operation.
As suggested by u/Rawing7 from reddit, a better approach is to use shutil.rmtree.
>>> from pathlib import Path
>>> import shutil
>>> root = Path('/home/miguel/Desktop/blog/pathlib/sandbox')
>>> root.exists()
True
>>> shutil.rmtree(root)
>>> root.exists()
False
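If you would like to try shutil.rmtree without risking real data, the sketch below rebuilds a sandbox similar to the one above inside a throwaway temporary directory. It also shows the ignore_errors flag, which makes a call on a missing directory a silent no-op.

```python
import shutil
import tempfile
from pathlib import Path

# build a disposable copy of the sandbox layout
base = Path(tempfile.mkdtemp())
sandbox = base / 'sandbox'
(sandbox / 'reports' / 'another_nested').mkdir(parents=True)
(sandbox / 'article.txt').touch()
(sandbox / 'reports' / 'article.txt').touch()
(sandbox / 'reports' / 'another_nested' / 'some_file.png').touch()

existed = sandbox.exists()
shutil.rmtree(sandbox)                      # removes the whole tree
gone = not sandbox.exists()

shutil.rmtree(sandbox, ignore_errors=True)  # already gone, no exception
base.rmdir()                                # clean up the temporary parent

print(existed, gone)  # True True
```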
Working with files
In this section, we’ll use pathlib to perform operations on files. For example, we’ll see how we can:
- create new files
- copy existing files
- delete files with pathlib
- read and write files with pathlib
Specifically, we’ll learn how to:
- create (touch) an empty file
- touch a file with timestamp
- touch a new file and create the parent directories if they don’t exist
- get the file name
- get the file extension from a filename
- open a file for reading
- read a text file
- read a JSON file
- read a binary file
- open all the files in a folder
- write a text file
- write a JSON file
- write bytes to a file
- copy an existing file to another directory
- delete a single file
- delete all files in a directory
- rename a file by changing its name, or by adding a new extension
- get the parent directory of a file or script
How to touch (create an empty) file
pathlib provides a method named Path.touch() to create an empty file. This method is very handy when you need to create a placeholder file if it does not exist.
>>> from pathlib import Path
>>> Path('empty.txt').exists()
False
>>> Path('empty.txt').touch()
>>> Path('empty.txt').exists()
True
Touch a file with timestamp
To create a timestamped empty file, we first need to determine the timestamp format.
One way to do that is to use the time and datetime modules. First we define a date format, then we use the datetime module to create the datetime object. Then, we use time.mktime to get back the timestamp.
Once we have the timestamp, we can just use f-strings to build the filename.
>>> import time, datetime
>>> s = '02/03/2021'
>>> d = datetime.datetime.strptime(s, "%d/%m/%Y")
>>> d
datetime.datetime(2021, 3, 2, 0, 0)
>>> d.timetuple()
time.struct_time(tm_year=2021, tm_mon=3, tm_mday=2, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=61, tm_isdst=-1)
>>> time.mktime(d.timetuple())
1614643200.0
>>> int(time.mktime(d.timetuple()))
1614643200
>>> from pathlib import Path
>>> Path(f'empty_{int(time.mktime(d.timetuple()))}.txt').exists()
False
>>> Path(f'empty_{int(time.mktime(d.timetuple()))}.txt').touch()
>>> Path(f'empty_{int(time.mktime(d.timetuple()))}.txt').exists()
True
>>> str(Path(f'empty_{int(time.mktime(d.timetuple()))}.txt'))
'empty_1614643200.txt'
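The value returned by time.mktime depends on the local time zone, so the number in the filename can differ between machines. If a human-readable stamp is acceptable, an alternative sketch is to format the datetime directly inside the f-string; the naming scheme below is just an assumption for illustration.

```python
import datetime
import tempfile
from pathlib import Path

d = datetime.datetime.strptime('02/03/2021', '%d/%m/%Y')

# format the datetime inline; no conversion to a Unix timestamp needed
filename = f'empty_{d:%Y%m%d_%H%M%S}.txt'

with tempfile.TemporaryDirectory() as tmp:
    stamped = Path(tmp) / filename
    stamped.touch()
    created = stamped.exists()

print(filename)  # empty_20210302_000000.txt
```

A side benefit is that names in this format sort chronologically when listed alphabetically.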
How to touch a file and create parent directories
Another common problem when creating empty files is placing them in a directory that doesn’t exist yet. The reason is that Path.touch() only works if the directory exists. To illustrate that, let’s see an example.
>>> from pathlib import Path
>>> Path('path/not_created_yet/empty.txt')
PosixPath('path/not_created_yet/empty.txt')
>>> Path('path/not_created_yet/empty.txt').exists()
False
>>> Path('path/not_created_yet/empty.txt').touch()
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-24-177d43b041e9> in <module>
----> 1 Path('path/not_created_yet/empty.txt').touch()
~/.pyenv/versions/3.9.4/lib/python3.9/pathlib.py in touch(self, mode, exist_ok)
1302 if not exist_ok:
1303 flags |= os.O_EXCL
-> 1304 fd = self._raw_open(flags, mode)
1305 os.close(fd)
1306
~/.pyenv/versions/3.9.4/lib/python3.9/pathlib.py in _raw_open(self, flags, mode)
1114 as os.open() does.
...
-> 1116 return self._accessor.open(self, flags, mode)
1117
1118 # Public API
FileNotFoundError: [Errno 2] No such file or directory: 'path/not_created_yet/empty.txt'
If the target directory does not exist, pathlib raises FileNotFoundError. To fix that, we need to create the directory first. The simplest way, as described in the «creating directories» section, is to use Path.mkdir(parents=True, exist_ok=True). This method creates an empty directory, including all missing parent directories.
>>> from pathlib import Path
>>> Path('path/not_created_yet/empty.txt').exists()
False
# let's create the empty folder first
>>> folder = Path('path/not_created_yet/')
# it doesn't exist yet
>>> folder.exists()
False
# create it
>>> folder.mkdir(parents=True, exist_ok=True)
>>> folder.exists()
True
# the folder exists, but we still need to create the empty file
>>> Path('path/not_created_yet/empty.txt').exists()
False
# create it as usual using pathlib touch
>>> Path('path/not_created_yet/empty.txt').touch()
# verify it exists
>>> Path('path/not_created_yet/empty.txt').exists()
True
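The two steps above can be bundled into a small helper. This is only a sketch: touch_with_parents is a hypothetical name, and the demo runs in a temporary directory so it can be executed safely.

```python
import tempfile
from pathlib import Path


def touch_with_parents(path: Path) -> Path:
    """Create an empty file, creating missing parent directories first."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch(exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = touch_with_parents(Path(tmp) / 'path' / 'not_created_yet' / 'empty.txt')
    created = target.is_file()
    # calling it again is harmless because both steps use exist_ok=True
    touch_with_parents(target)
    still_there = target.is_file()

print(created, still_there)  # True True
```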
How to get the filename from path
A Path comes with not only methods but also properties. One of them is Path.name, which, as the name implies, returns the filename of the path. This property ignores the parent directories and returns only the file name, including the extension.
>>> from pathlib import Path
>>> picture = Path('/home/miguel/Desktop/profile.png')
>>> picture.name
'profile.png'
How to get the filename without the extension
Sometimes, you might need to retrieve the file name without the extension. A natural way of doing this would be splitting the string on the dot. However, pathlib.Path comes with another helper property named Path.stem, which returns the final component of the path, without the extension.
>>> from pathlib import Path
>>> picture = Path('/home/miguel/Desktop/profile.png')
>>> picture.stem
'profile'
How to get the file extension from a filename using pathlib
If the Path.stem property returns the filename excluding the extension, how can we do the opposite? How do we retrieve only the extension?
We can do that using the Path.suffix property.
>>> from pathlib import Path
>>> picture = Path('/home/miguel/Desktop/profile.png')
>>> picture.suffix
'.png'
Some files, such as .tar.gz archives, have a two-part extension, and Path.suffix returns only the last part. To get the whole extension, you need the Path.suffixes property.
This property returns a list of all suffixes for that path. We can then join the list into a single string.
>>> backup = Path('/home/miguel/Desktop/photos.tar.gz')
>>> backup.suffix
'.gz'
>>> backup.suffixes
['.tar', '.gz']
>>> ''.join(backup.suffixes)
'.tar.gz'
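Building on that, the sketch below splits a filename into its base name and its full extension. split_extensions is a hypothetical helper; note that it treats every dot-separated part after the first as an extension, which may not be what you want for names like my.report.txt.

```python
from pathlib import PurePosixPath


def split_extensions(path):
    """Return (base name, full extension) for a path (hypothetical helper)."""
    extension = ''.join(path.suffixes)
    # slice by length so an empty extension leaves the name intact
    base = path.name[:len(path.name) - len(extension)]
    return base, extension


archive = split_extensions(PurePosixPath('/home/miguel/Desktop/photos.tar.gz'))
plain = split_extensions(PurePosixPath('README'))

print(archive)  # ('photos', '.tar.gz')
print(plain)    # ('README', '')
```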
How to open a file for reading with pathlib
Another great feature of pathlib is the ability to open the file pointed to by the path. The behavior is similar to the built-in open() function. In fact, it accepts pretty much the same parameters.
>>> from pathlib import Path
>>> p = Path('/home/miguel/Desktop/blog/pathlib/recipe.txt')
# open the file
>>> f = p.open()
# read it
>>> lines = f.readlines()
>>> print(lines)
['1. Boil water. \n', '2. Warm up teapot. ...\n', '3. Put tea into teapot and add hot water.\n', '4. Cover teapot and steep tea for 5 minutes.\n', '5. Strain tea solids and pour hot tea into tea cups.\n']
# then make sure to close the file descriptor
>>> f.close()
# or use a context manager, and read the file in one go
>>> with p.open() as f:
...     lines = f.readlines()
>>> print(lines)
['1. Boil water. \n', '2. Warm up teapot. ...\n', '3. Put tea into teapot and add hot water.\n', '4. Cover teapot and steep tea for 5 minutes.\n', '5. Strain tea solids and pour hot tea into tea cups.\n']
# you can also read the whole content as string
>>> with p.open() as f:
...     content = f.read()
>>> print(content)
1. Boil water.
2. Warm up teapot. ...
3. Put tea into teapot and add hot water.
4. Cover teapot and steep tea for 5 minutes.
5. Strain tea solids and pour hot tea into tea cups.
How to read text files with pathlib
In the previous section, we used the Path.open() method and file.read() to read the contents of the text file as a string. Even though that works just fine, you still need to close the file yourself or use the with keyword to close it automatically.
pathlib comes with a .read_text() method that does that for you, which is much more convenient.
>>> from pathlib import Path
>>> p = Path('/home/miguel/Desktop/blog/pathlib/recipe.txt')
# just call '.read_text()', no need to close the file
>>> content = p.read_text()
>>> print(content)
1. Boil water.
2. Warm up teapot. ...
3. Put tea into teapot and add hot water.
4. Cover teapot and steep tea for 5 minutes.
5. Strain tea solids and pour hot tea into tea cups.
As the pathlib docs put it: "The file is opened and then closed. The optional parameters have the same meaning as in open()."
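Since the optional parameters mirror those of open(), you can pass an explicit encoding. The following is a minimal sketch that assumes a throwaway file in a temporary directory; the file name and contents are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    recipe = Path(tmp) / 'recipe.txt'  # hypothetical file for the demo
    recipe.write_text('1. Boil water.\n2. Warm up teapot.\n', encoding='utf-8')

    # read_text() opens, reads, and closes the file in one call
    content = recipe.read_text(encoding='utf-8')
    steps = content.splitlines()

print(steps)
```

Passing the encoding explicitly avoids relying on the platform default, which can differ between machines.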
How to read JSON files from path with pathlib
A JSON file is nothing more than a text file structured according to the JSON specification. To read a JSON file, we can open the path for reading, as we do for text files, and pass the file object to the json.load() function from the json module.
>>> import json
>>> from pathlib import Path
>>> response = Path('./jsons/response.json')
>>> with response.open() as f:
...     resp = json.load(f)
>>> resp
{'name': 'remi', 'age': 28}
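An alternative that skips the explicit open() call is to read the text first and parse it with json.loads(). This is a sketch under the assumption that the file is small enough to load into memory; the temporary file below is only there to make the example self-contained.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    response = Path(tmp) / 'response.json'  # hypothetical file for the demo
    response.write_text('{"name": "remi", "age": 28}', encoding='utf-8')

    # read_text() returns a str, so json.loads() (with an "s") is the right call
    resp = json.loads(response.read_text(encoding='utf-8'))

print(resp)
```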
How to read binary files with pathlib
At this point, if you know how to read a text file, then reading binary files will be easy. We can do this in two ways:
- with the Path.open() method, passing the mode rb
- with the Path.read_bytes() method
Let’s start with the first method.
>>> from pathlib import Path
>>> picture = Path('/home/miguel/Desktop/profile.png')
# open the file in binary mode
>>> f = picture.open('rb')
# read it
>>> image_bytes = f.read()
>>> print(image_bytes)
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01R\x00\x00\x01p\x08\x02\x00\x00\x00e\xd3d\x85\x00\x00\x00\x03sBIT\x08\x08\x08\xdb\xe1O\xe0\x00\x00\x00\x10tEXtSoftware\x00Shutterc\x82\xd0\t\x00\x00 \x00IDATx\xda\xd4\xbdkw\x1cY\x92\x1ch\xe6~#2\x13\xe0\xa3\xaa\xbbg
... [OMITTED] ....
0e\xe5\x88\xfc\x7fa\x1a\xc2p\x17\xf0N\xad\x00\x00\x00\x00IEND\xaeB`\x82'
# then make sure to close the file descriptor
>>> f.close()
# or use a context manager, and read the file in one go
>>> with picture.open('rb') as f:
...     image_bytes = f.read()
>>> print(image_bytes)
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01R\x00\x00\x01p\x08\x02\x00\x00\x00e\xd3d\x85\x00\x00\x00\x03sBIT\x08\x08\x08\xdb\xe1O\xe0\x00\x00\x00\x10tEXtSoftware\x00Shutterc\x82\xd0\t\x00\x00 \x00IDATx\xda\xd4\xbdkw\x1cY\x92\x1ch\xe6~#2\x13\xe0\xa3\xaa\xbbg
... [OMITTED] ....
0e\xe5\x88\xfc\x7fa\x1a\xc2p\x17\xf0N\xad\x00\x00\x00\x00IEND\xaeB`\x82'
And just like Path.read_text()
, pathlib
comes with a .read_bytes()
method that can open and close the file for you.
>>> from pathlib import Path
# just call '.read_bytes()', no need to close the file
>>> picture = Path('/home/miguel/Desktop/profile.png')
>>> picture.read_bytes()
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01R\x00\x00\x01p\x08\x02\x00\x00\x00e\xd3d\x85\x00\x00\x00\x03sBIT\x08\x08\x08\xdb\xe1O\xe0\x00\x00\x00\x10tEXtSoftware\x00Shutterc\x82\xd0\t\x00\x00 \x00IDATx\xda\xd4\xbdkw\x1cY\x92\x1ch\xe6~#2\x13\xe0\xa3\xaa\xbbg
... [OMITTED] ....
0e\xe5\x88\xfc\x7fa\x1a\xc2p\x17\xf0N\xad\x00\x00\x00\x00IEND\xaeB`\x82'
How to open all files in a directory in Python
Let’s imagine you need a Python script to search all files in a directory and open them all. Maybe you want to filter by extension, or you want to do it recursively. If you’ve been following this guide from the beginning, you already know how to use the Path.iterdir() method.
To open all files in a directory, we can combine Path.iterdir() with Path.is_file().
>>> import pathlib
# we can use iterdir to traverse all paths in a directory
>>> for path in pathlib.Path("my_images").iterdir():
...     # if the path is a file, then we open it
...     if path.is_file():
...         with path.open("rb") as f:
...             image_bytes = f.read()
...             # load_image_from_bytes stands in for your own processing function
...             load_image_from_bytes(image_bytes)
If you need to do it recursively, we can use Path.rglob()
instead of Path.iterdir()
.
>>> import pathlib
# we can use rglob to walk nested directories
>>> for path in pathlib.Path("my_images").rglob('*'):
...     # if the path is a file, then we open it
...     if path.is_file():
...         with path.open("rb") as f:
...             image_bytes = f.read()
...             # load_image_from_bytes stands in for your own processing function
...             load_image_from_bytes(image_bytes)
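Filtering by extension, mentioned at the start of this section, can be done by passing a pattern to Path.glob() or Path.rglob() instead of '*'. The sketch below builds a small throwaway directory so it can run anywhere; the file names are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'nested').mkdir()
    (root / 'a.png').write_bytes(b'fake image bytes')
    (root / 'notes.txt').write_text('not an image')
    (root / 'nested' / 'b.png').write_bytes(b'more fake bytes')

    # glob() only looks at the directory itself
    top_level = sorted(p.name for p in root.glob('*.png') if p.is_file())
    # rglob() also descends into subdirectories
    recursive = sorted(p.name for p in root.rglob('*.png') if p.is_file())

print(top_level)
print(recursive)
```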
How to write a text file with pathlib
In previous sections, we saw how to read text files using Path.read_text()
.
To write a text file to disk, pathlib comes with the Path.write_text() method. The benefit of using this method is that it writes the data and closes the file for you, and the optional parameters have the same meaning as in open().
⚠️ WARNING: If the file already exists, Path.write_text() will overwrite it.
>>> import pathlib
>>> file_path = pathlib.Path('/home/miguel/Desktop/blog/recipe.txt')
>>> recipe_txt = '''
1. Boil water.
2. Warm up teapot. ...
3. Put tea into teapot and add hot water.
4. Cover teapot and steep tea for 5 minutes.
5. Strain tea solids and pour hot tea into tea cups.
'''
>>> file_path.exists()
False
>>> file_path.write_text(recipe_txt)
180
>>> content = file_path.read_text()
>>> print(content)
1. Boil water.
2. Warm up teapot. ...
3. Put tea into teapot and add hot water.
4. Cover teapot and steep tea for 5 minutes.
5. Strain tea solids and pour hot tea into tea cups.
How to write JSON files to path with pathlib
Python represents JSON objects as plain dictionaries. To write one to a file as JSON using pathlib, we combine the json.dump() function with Path.open(), the same way we did to read JSON from disk.
>>> import json
>>> import pathlib
>>> resp = {'name': 'remi', 'age': 28}
>>> response = pathlib.Path('./response.json')
>>> response.exists()
False
>>> with response.open('w') as f:
...     json.dump(resp, f)
>>> response.read_text()
'{"name": "remi", "age": 28}'
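If you prefer a one-liner, json.dumps() can produce the string and Path.write_text() can write it. This is a sketch, assuming the data fits comfortably in memory; the temporary directory is only used to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path

resp = {'name': 'remi', 'age': 28}

with tempfile.TemporaryDirectory() as tmp:
    response = Path(tmp) / 'response.json'  # hypothetical file for the demo

    # serialize to a string first, then let write_text() open and close the file
    response.write_text(json.dumps(resp, indent=2), encoding='utf-8')
    round_trip = json.loads(response.read_text(encoding='utf-8'))

print(round_trip == resp)
```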
How to write bytes data to a file
To write bytes to a file, we can use either the Path.open() method with the mode wb or the Path.write_bytes() method.
>>> from pathlib import Path
>>> image_path_1 = Path('./profile.png')
>>> image_bytes = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00 [OMITTED] \x00IEND\xaeB`\x82'
>>> with image_path_1.open('wb') as f:
...     f.write(image_bytes)
>>> image_path_1.read_bytes()
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00 [OMITTED] \x00IEND\xaeB`\x82'
>>> image_path_2 = Path('./profile_2.png')
>>> image_path_2.exists()
False
>>> image_path_2.write_bytes(image_bytes)
37
>>> image_path_2.read_bytes()
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00 [OMITTED] \x00IEND\xaeB`\x82'
How to copy files with pathlib
For most of its history, pathlib has had no method for copying files (Path.copy() only arrived in Python 3.14). However, having a file represented by a path doesn’t mean we can’t copy it. There are two different ways of doing that:
- using the shutil module
- using the Path.read_bytes() and Path.write_bytes() methods
For the first alternative, we use the shutil.copyfile(src, dst)
function and pass the source and destination path.
>>> import shutil
>>> from pathlib import Path
>>> src = Path('/home/miguel/Desktop/blog/pathlib/sandbox/article.txt')
>>> src.exists()
True
>>> dst = Path('/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt')
>>> dst.exists()
False
>>> shutil.copyfile(src, dst)
PosixPath('/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt')
>>> dst.exists()
True
>>> dst.read_text()
'This is \n\nan \n\ninteresting article.\n'
>>> dst.read_text() == src.read_text()
True
⚠️ WARNING: shutil prior to Python 3.6 cannot handle Path instances. You need to convert the path to a string first.
The second method involves reading the whole file into memory, then writing it to the destination.
>>> from pathlib import Path
>>> src = Path('/home/miguel/Desktop/blog/pathlib/sandbox/article.txt')
>>> src.exists()
True
>>> dst = Path('/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt')
>>> dst.exists()
False
>>> dst.write_bytes(src.read_bytes())
36
>>> dst.exists()
True
>>> dst.read_text()
'This is \n\nan \n\ninteresting article.\n'
>>> dst.read_text() == src.read_text()
True
⚠️ WARNING: This method will overwrite the destination path. If that’s a concern, it’s advisable either to check whether the file exists first, or to open the file for writing using the x flag. This flag opens the file for exclusive creation, failing with FileExistsError if the file already exists.
Another downside of this approach is that it loads the file to memory. If the file is big, prefer shutil.copyfileobj
. It supports buffering and can read the file in chunks, thus avoiding uncontrolled memory consumption.
>>> from pathlib import Path
>>> src = Path('/home/miguel/Desktop/blog/pathlib/sandbox/article.txt')
>>> dst = Path('/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt')
>>> if not dst.exists():
...     dst.write_bytes(src.read_bytes())
... else:
...     print('File already exists, aborting...')
File already exists, aborting...
>>> with dst.open('xb') as f:
...     f.write(src.read_bytes())
---------------------------------------------------------------------------
FileExistsError Traceback (most recent call last)
<ipython-input-25-1974c5808b1a> in <module>
----> 1 with dst.open('xb') as f:
2 f.write(src.read_bytes())
3
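For large files, the shutil.copyfileobj approach mentioned above could look roughly like the sketch below. `copy_in_chunks` is a hypothetical helper name, and the 64 KiB chunk size is an arbitrary choice for illustration; the destination is opened with the x flag so an existing file is never overwritten.

```python
import shutil
import tempfile
from pathlib import Path


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst chunk by chunk; raise FileExistsError if dst exists (hypothetical helper)."""
    with src.open('rb') as fsrc, dst.open('xb') as fdst:
        shutil.copyfileobj(fsrc, fdst, chunk_size)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'article.txt'
    dst = Path(tmp) / 'article_copy.txt'
    src.write_bytes(b'x' * 200_000)

    copy_in_chunks(src, dst)
    same = dst.read_bytes() == src.read_bytes()

    # a second copy to the same destination is refused
    try:
        copy_in_chunks(src, dst)
        second_copy_failed = False
    except FileExistsError:
        second_copy_failed = True

print(same, second_copy_failed)
```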
How to delete a file with pathlib
You can remove a file or symbolic link with the Path.unlink()
method.
>>> from pathlib import Path
>>> Path('path/reports/report.csv').touch()
>>> path = Path('path/reports/report.csv')
>>> path.exists()
True
>>> path.unlink()
>>> path.exists()
False
As of Python 3.8, this method takes an optional argument named missing_ok. By default, missing_ok is set to False, which means a FileNotFoundError is raised if the file doesn’t exist.
>>> path = Path('path/reports/report.csv')
>>> path.exists()
False
>>> path.unlink()
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-6-8eea53121d7f> in <module>
----> 1 path.unlink()
~/.pyenv/versions/3.9.4/lib/python3.9/pathlib.py in unlink(self, missing_ok)
1342 try:
-> 1343 self._accessor.unlink(self)
1344 except FileNotFoundError:
1345 if not missing_ok:
FileNotFoundError: [Errno 2] No such file or directory: 'path/reports/report.csv'
# when missing_ok is True, no error is raised
>>> path.unlink(missing_ok=True)
How to delete all files in a directory with pathlib
To remove all files in a folder, we need to traverse it and check if the path is a file, and if so, call Path.unlink()
on it as we saw in the previous section.
To walk over the contents of a directory, we can use Path.iterdir()
. Let’s consider the following directory.
$ tree /home/miguel/path/
/home/miguel/path/
├── jsons
│ └── response.json
├── new_parent_dir
│ └── sub_dir
├── non_empty_dir
│ └── file.txt
├── not_created_yet
│ └── empty.txt
├── number.csv
├── photo_1.png
├── report.md
└── reports
This approach only deletes the files directly under the given directory; it is not recursive.
>>> import pathlib
>>> path = pathlib.Path('/home/miguel/path')
>>> list(path.iterdir())
[PosixPath('/home/miguel/path/jsons'),
PosixPath('/home/miguel/path/non_empty_dir'),
PosixPath('/home/miguel/path/not_created_yet'),
PosixPath('/home/miguel/path/reports'),
PosixPath('/home/miguel/path/photo_1.png'),
PosixPath('/home/miguel/path/number.csv'),
PosixPath('/home/miguel/path/new_parent_dir'),
PosixPath('/home/miguel/path/report.md')]
>>> for p in path.iterdir():
...     if p.is_file():
...         p.unlink()
>>> list(path.iterdir())
[PosixPath('/home/miguel/path/jsons'),
PosixPath('/home/miguel/path/non_empty_dir'),
PosixPath('/home/miguel/path/not_created_yet'),
PosixPath('/home/miguel/path/reports'),
PosixPath('/home/miguel/path/new_parent_dir')]
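If you also want to delete the files inside subdirectories, one option is to swap Path.iterdir() for Path.rglob('*'). The sketch below runs against a throwaway directory with made-up names; the paths are collected into a list first so the tree is not modified while it is being walked. Directories themselves are left in place.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'jsons').mkdir()
    (root / 'jsons' / 'response.json').write_text('{}')
    (root / 'report.md').write_text('# report')

    # materialize the listing before deleting anything
    for p in list(root.rglob('*')):
        if p.is_file():
            p.unlink()

    remaining = sorted(p.name for p in root.rglob('*'))

print(remaining)
```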
How to rename a file using pathlib
pathlib
also comes with a method to rename files called Path.rename(target)
. It takes a target file path and renames the source to the target. As of Python 3.8, Path.rename()
returns the new Path instance.
>>> from pathlib import Path
>>> src_file = Path('recipe.txt')
>>> src_file.write_text('A delicious recipe')
18
>>> src_file.read_text()
'A delicious recipe'
>>> target = Path('new_recipe.txt')
>>> src_file.rename(target)
PosixPath('new_recipe.txt')
>>> src_file
PosixPath('recipe.txt')
>>> src_file.exists()
False
>>> target.read_text()
'A delicious recipe'
Renaming only file extension
If all you want is to change the file extension to something else, for example from .txt to .md, you can use Path.rename(target) in conjunction with the Path.with_suffix(suffix) method, which does the following:
- replaces the existing suffix with the new one
- appends the new suffix, if the original path doesn’t have one
- removes the suffix, if the supplied suffix is an empty string
Let’s see an example where we change our recipe file from plain text .txt
to markdown .md
.
>>> from pathlib import Path
>>> src_file = Path('recipe.txt')
>>> src_file.write_text('A delicious recipe')
18
>>> new_src_file = src_file.rename(src_file.with_suffix('.md'))
>>> new_src_file
PosixPath('recipe.md')
>>> src_file.exists()
False
>>> new_src_file.exists()
True
>>> new_src_file.read_text()
'A delicious recipe'
>>> removed_extension_file = new_src_file.rename(src_file.with_suffix(''))
>>> removed_extension_file
PosixPath('recipe')
>>> removed_extension_file.read_text()
'A delicious recipe'
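Because Path.with_suffix() is a pure computation, you can try its behavior without touching the disk. The sketch below uses PurePosixPath so the output looks the same on any operating system; note that only the last suffix is affected when a name has several.

```python
from pathlib import PurePosixPath

results = [
    str(PurePosixPath('recipe.txt').with_suffix('.md')),      # existing suffix is replaced
    str(PurePosixPath('recipe').with_suffix('.md')),          # no suffix, so one is appended
    str(PurePosixPath('recipe.txt').with_suffix('')),         # empty string removes the suffix
    str(PurePosixPath('photos.tar.gz').with_suffix('.zip')),  # only the last suffix changes
]

for result in results:
    print(result)
```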
How to get the parent directory of a file with pathlib
Sometimes we want to get the name of the directory a file belongs to. You can get that through a Path
property named parent
. This property represents the logical parent of the path, which means it returns the parent of a file or directory.
>>> from pathlib import Path
>>> path = Path('path/reports/report.csv')
>>> path.exists()
False
>>> parent_dir = path.parent
>>> parent_dir
PosixPath('path/reports')
>>> parent_dir.parent
PosixPath('path')
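When you need every ancestor rather than just the immediate one, the related Path.parents property gives you a sequence of them. A small sketch, using PurePosixPath so that it behaves the same on every platform:

```python
from pathlib import PurePosixPath

path = PurePosixPath('path/reports/report.csv')

# parents is ordered from the closest ancestor to the farthest
ancestors = [str(p) for p in path.parents]

print(ancestors)
print(path.parents[0] == path.parent)
```

For a relative path like this one, the last ancestor is the current directory, written as '.'.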
Conclusion
That was a lot to learn, and I hope you enjoyed it just as I enjoyed writing it.
pathlib
has been part of the standard library since Python 3.4 and it’s a great solution when it comes to handling paths.
In this guide, we covered the most important use cases in which pathlib
shines through tons of examples.
I hope this cookbook is useful to you, and see you next time.
This article was originally published at https://miguendes.me
Interacting with the file system
Programs often need to interact with the file system, and the Python standard library provides many tools that greatly simplify this.
File and directory paths
A path is a string of characters that identifies the location of a file or directory in a file system (source: Wikipedia). Programs need paths, for example, to open and save files. In most cases, Python represents a path as an ordinary string object.
A path is usually a sequence of nested directories separated by a special character, and that separator depends on the operating system: Windows uses "\", while Unix-like systems use "/". Paths are also either absolute or relative. An absolute path always starts at the root of the file system (on Windows, a drive such as "C:"; on Unix-like systems, "/") and always points to the same file or directory. A relative path does not start at the root; it gives a location relative to the current working directory, so it points to a completely different file if the working directory changes.
So, for example, the path to the file "hello.py" in the home directory of the user "ivan" looks roughly like this, depending on the operating system:
| | Windows | Unix-like |
|---|---|---|
| Absolute | C:\Users\ivan\hello.py | /home/users/ivan/hello.py |
| Relative | .\hello.py | ./hello.py |
Because of this, extra effort is needed to make the same code work on machines with different operating systems. To abstract away how the file system is organized on any particular machine, Python's standard library provides the os.path and pathlib modules.
The problem with Windows-style paths
As noted above, Windows uses the backslash "\" as its separator. This can confuse inexperienced programmers. In many programming languages (Python included), the "\" character inside strings is reserved for escaping: when a "\" appears inside a string, it is not taken literally as a backslash but changes the meaning of the character that follows it. For example, the sequence "\n" represents a single control character, the line feed.
new_line = "\n"
print(len(new_line))
This means that if you write a Windows path without taking this into account, you will likely get a result you did not expect. For example, the string "C:\Users" is not even valid Python syntax:
users_folder = "C:\Users"
Input In [10]
    users_folder = "C:\Users"
                   ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
This happens because the sequence "\U" introduces a Unicode escape, and the characters "sers" are not a valid Unicode code. Below is an example of a valid Unicode escape.
snake_emoji = "\U0001F40D"
print(snake_emoji)
Python offers at least two ways of dealing with this problem.
The first is to double every "\" character. In the sequence "\\", the first backslash escapes the second, so the result is equivalent to a single literal backslash.
users_folder = "C:\\Users"
print(users_folder)
new_line = "\\n"
print(len(new_line))
The second way relies on so-called raw strings: if you put the character "r" before the opening quote of a string literal, the backslash loses its special role inside it.
users_folder = r"C:\Users"
print(users_folder)
new_line = r"\n"
print(len(new_line))
The very fact that hand-writing a path as a string requires this extra vigilance suggests there should be a more sensible way to build paths.
Joining path components
Let's look at a concrete example. Suppose we have a string folder holding the path to a directory and a string filename holding the name of a file inside that directory.
folder = "directory"
filename = "file.txt"
To open this file, we need to join the two strings with the directory separator between them.
Of course, a path is a string, so the two could simply be concatenated. But what if someone runs your code on a machine with a different operating system? It is much wiser to use dedicated tools. The most reliable option is the os.path.join function, which takes any number of path components and joins them with the separator used on the machine where the script is currently running.
import os
path = os.path.join(folder, filename)
print(path)
An alternative is the pathlib module, which handles file system paths in an object-oriented style: a path is no longer a string but a special object that can be converted to a string at any time with the str constructor.
Such an object is usually created with the Path class, whose instantiation takes into account the operating system the script is running on.
from pathlib import Path
folder = Path(folder)
print(f"{folder=}, {str(folder)=}")
folder=WindowsPath('directory'), str(folder)='directory'
The cell above creates a Path object from the string folder, and the output shows that a WindowsPath('directory') object was created. Note that a Windows path was created automatically because this script was run on that operating system.
To append the file name to the folder object, you can use the "/" operator regardless of the operating system.
path = folder / filename
print(f"{path=}, {str(path)=}")
path=WindowsPath('directory/file.txt'), str(path)='directory\\file.txt'
Note that converting to a string automatically produced a Windows-style separator, because these materials were generated on a Windows machine.
The course author recommends always using os.path or pathlib, even if you know your script will only run on one particular operating system, in order to write more robust code and build good habits.
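To see how the same join turns out under both conventions without switching machines, you can use the pure path classes from pathlib, which never touch the file system. A minimal sketch, reusing the folder and filename values from the example above:

```python
from pathlib import PurePosixPath, PureWindowsPath

folder = "directory"
filename = "file.txt"

# the "/" operator works the same way for both flavours
windows_path = PureWindowsPath(folder) / filename
posix_path = PurePosixPath(folder) / filename

print(str(windows_path))
print(str(posix_path))
```

Only the string conversion differs: the Windows flavour renders a backslash, the POSIX flavour a forward slash.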
Extracting components from a path
Sometimes the task is the reverse: given a path, extract something from it.
path = r"C:\Users\fadeev\folder\file.txt"
The os.path.splitdrive function splits a string into the drive and the rest (mostly relevant on Windows).
print(f"{path=}")
drive, tail = os.path.splitdrive(path)
print(f"{drive=}, {tail=}")
path='C:\\Users\\fadeev\\folder\\file.txt'
drive='C:', tail='\\Users\\fadeev\\folder\\file.txt'
The os.path.dirname function extracts the parent directory from a path.
parent_folder = os.path.dirname(path)
print(f"{parent_folder=}")
parent_folder='C:\\Users\\fadeev\\folder'
The os.path.basename function does the opposite: it extracts the name of the file or folder the path points to, without the parent directory.
filename = os.path.basename(path)
print(f"{filename=}")
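The same pieces are available as attributes on pathlib objects. The sketch below uses PureWindowsPath so that the Windows path from this section is parsed the same way on any operating system; it is an illustration of the pathlib counterparts, not part of the original course code.

```python
from pathlib import PureWindowsPath

path = PureWindowsPath(r"C:\Users\fadeev\folder\file.txt")

print(path.drive)    # counterpart of os.path.splitdrive(...)[0]
print(path.parent)   # counterpart of os.path.dirname
print(path.name)     # counterpart of os.path.basename
print(path.parts)    # every component as a tuple
```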
File and directory metadata
Given a path, you can ask the operating system what is located at it. Keep in mind that this step always involves a request to the operating system, and if the user running the program lacks the privileges for the requested operation, you may get different answers depending on the operating system.
The most fundamental question you can ask is whether anything exists at the given path at all. The os.path.exists function answers exactly that.
print(f"{os.path.exists(path)=}, {os.path.exists('filesystem.ipynb')=}")
os.path.exists(path)=False, os.path.exists('filesystem.ipynb')=True
The os.path.isdir and os.path.isfile functions determine whether a directory or a file, respectively, is located at the path. Both return False if nothing exists at the given path.
print(f"{os.path.isdir(folder)=}, {os.path.isfile('filesystem.ipynb')=}")
os.path.isdir(folder)=True, os.path.isfile('filesystem.ipynb')=True
It is also sometimes useful to know when a file or directory was created, last modified, or last accessed. The os.path.getctime, os.path.getmtime, and os.path.getatime functions serve this purpose (note that on Unix, getctime reports the last metadata change rather than the creation time). A file's size is available through os.path.getsize.
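A short sketch of the size and timestamp functions follows. It creates a throwaway file in a temporary directory (the name data.bin is made up for the example); the timestamps are returned as seconds since the epoch, so time.ctime is used to make one readable.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "data.bin")  # hypothetical file for the demo
    with open(file_path, "wb") as f:
        f.write(b"\x00" * 1024)

    size = os.path.getsize(file_path)       # size in bytes
    modified = os.path.getmtime(file_path)  # float, seconds since the epoch

print(size)
print(time.ctime(modified))
```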
Directory contents
Some tasks require listing the contents of a directory, for example to process each entry in a loop. In the simplest cases, os.listdir is enough: it returns a list of the files and directories in the given directory, defaulting to the current directory.
for filename in os.listdir():
    print(filename, end=" ")
.ipynb_checkpoints about_12_and_so_on.ipynb about_python.ipynb argparse.ipynb custom_classes.ipynb custom_exceptions.ipynb decorators.ipynb dictionaries.ipynb dynamic_typing.ipynb exceptions.ipynb exercises1.ipynb exercises2.ipynb exercises3.ipynb files.ipynb filesystem.ipynb functions.ipynb garbage_collector.ipynb generators.ipynb if_for_range.ipynb inheritance.ipynb iterators.ipynb json.ipynb jupyter.ipynb LBYL_vs_EAFP.ipynb list_comprehensions.ipynb mutability.ipynb numbers_and_lists.ipynb operators_overloading.ipynb polymorphism.ipynb python_scripts.ipynb scripts_vs_modules.ipynb sequencies.ipynb tmp
Remember that, according to the documentation, this function returns the list in arbitrary order; it is not sorted in any way. If you need the entries sorted by name, for example alphabetically, use the built-in sorted function. In almost all other cases it is better to choose os.scandir, which returns not only the directory contents (also in arbitrary order) but also metadata about each entry.
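As a rough illustration of os.scandir, the sketch below lists a throwaway directory and collects the name, the directory flag, and the file size of each entry. The file and directory names are invented; sorting is applied explicitly because the iteration order is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "subdir"))
    with open(os.path.join(tmp, "b.txt"), "w") as f:
        f.write("hello")
    with open(os.path.join(tmp, "a.txt"), "w") as f:
        f.write("hi")

    # each entry is an os.DirEntry carrying cached metadata
    with os.scandir(tmp) as entries:
        listing = sorted(
            (entry.name, entry.is_dir(), entry.stat().st_size if entry.is_file() else None)
            for entry in entries
        )

print(listing)
```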
The glob.glob function from the standard library's glob module filters directory contents by a pattern. The cell below shows how to find all files in a directory that start with the character "a" and end with the extension ".ipynb".
import glob
for filename in glob.glob("a*.ipynb"):
    print(filename)
about_12_and_so_on.ipynb about_python.ipynb argparse.ipynb
Creating, copying, moving, and deleting files and directories
The os.mkdir function creates a directory, with two caveats:
- if the directory already exists, an exception is raised;
- if the parent directory does not exist, an exception is raised as well.
An alternative is os.makedirs, which has an optional exist_ok parameter that suppresses the error raised when the directory already exists. In addition, if creating the directory requires creating several intermediate directories along the way, they are created too.
So os.mkdir is the more cautious of the two: it will tell you if you try to create a directory a second time or if you made a mistake somewhere in the path. os.makedirs is more flexible and lets you write less code, but if you make a mistake when building the path (for example, a typo in one directory name), you get no error message and the final directory is created anyway.
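The difference between the two functions can be seen in a few lines. The sketch below works inside a temporary directory, and the directory names parent and child are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "parent", "child")

    # os.mkdir refuses because "parent" does not exist yet
    try:
        os.mkdir(nested)
        mkdir_error = None
    except FileNotFoundError as exc:
        mkdir_error = type(exc).__name__

    os.makedirs(nested)                 # creates both "parent" and "child"
    os.makedirs(nested, exist_ok=True)  # repeating the call is accepted silently
    created = os.path.isdir(nested)

print(mkdir_error, created)
```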
The shutil standard library module contains a set of functions that mimic command-line commands: they copy files (shutil.copy, shutil.copy2, and shutil.copyfile), copy directories with their contents (shutil.copytree), delete directories (shutil.rmtree), and move files or directories (shutil.move).
Files can be deleted with os.remove.
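Here is a compact sketch that chains several of these functions together inside a temporary directory. All file and directory names are made up for the demonstration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "src")
    os.mkdir(src_dir)
    with open(os.path.join(src_dir, "note.txt"), "w") as f:
        f.write("hello")

    backup_dir = os.path.join(tmp, "backup")
    shutil.copytree(src_dir, backup_dir)  # copy the directory with its contents

    # move the copied file out of the backup; shutil.move returns the destination
    moved = shutil.move(os.path.join(backup_dir, "note.txt"), os.path.join(tmp, "moved.txt"))

    shutil.rmtree(src_dir)  # delete a directory and everything in it
    os.remove(moved)        # delete a single file
    remaining = sorted(os.listdir(tmp))

print(remaining)
```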
When working with file paths in Python, it is important to properly write the path in the code to avoid any issues. In particular, when dealing with Windows paths, special attention must be given due to the backslash (\) acting as an escape character in Python string literals. This article will explain the best practices for writing Windows paths in Python string literals, along with examples to illustrate the solutions.
Understanding the Issue
The problem arises when trying to write a Windows path directly in a Python string literal. For example, let’s say we want to refer to the path C:\meshes\as. If we write it as "C:\meshes\as", we will encounter problems. This is because the backslash (\) is treated as an escape character: here \a is read as the ASCII bell character, so the string no longer contains the path we typed.
One possible solution to this issue is to use raw strings by prefixing the string literal with the letter ‘r’. This tells Python to treat the string as a raw string, in which backslashes are kept literally instead of starting escape sequences. So, instead of "C:\meshes\as", we can write r"C:\meshes\as". This ensures that the path is interpreted correctly and avoids any escape character conflicts.
path = r"C:\meshes\as"
print(path)
# Output: C:\meshes\as
Alternate Escaping
Another way to solve this problem is by using double backslashes (\\) to escape the backslash character. This is because one backslash is treated as an escape character, but two backslashes are interpreted as a single backslash. So, instead of "C:\meshes\as", we can write "C:\\meshes\\as". This method works in ordinary (non-raw) string literals.
path = "C:\\meshes\\as"
print(path)
# Output: C:\meshes\as
Using os.path Module
Python’s os.path module provides a platform-independent way to handle file paths. It automatically adapts to the current operating system and handles the differences in path formats. Instead of manually writing the paths as discussed before, it is recommended to use the os.path functions to manipulate and construct file paths.
For example, to join two directory paths, you can use the os.path.join() function. This function takes care of the platform-specific separator and avoids any manual path string manipulation.
import os
directory1 = "C:\\meshes"
directory2 = "as"
path = os.path.join(directory1, directory2)
print(path)
# Output (on Windows): C:\meshes\as
The os.path module also provides various other functions to perform operations on file paths, such as os.path.dirname() to get the directory name from a path, os.path.abspath() to get the absolute path, and many more. It is recommended to use these functions to ensure correct and portable file path handling in Python programs.
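To experiment with Windows path handling on any operating system, one option is the standard library's ntpath module, which is the Windows implementation behind os.path. The sketch below is only an illustration; the file name model.obj is hypothetical.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

path = ntpath.join("C:\\meshes", "as", "model.obj")

print(path)
print(ntpath.dirname(path))
print(ntpath.basename(path))
```

In regular code you would keep using os.path, which resolves to ntpath on Windows and to posixpath elsewhere.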
Conclusion
When writing Windows paths in Python string literals, it is important to consider how the backslash (\) is treated as an escape character. To avoid any issues, you can use raw strings by prefixing the string with the letter ‘r’ (e.g., r"C:\meshes\as"), or use double backslashes (\\) to explicitly escape the backslash character (e.g., "C:\\meshes\\as"). Alternatively, you can use the os.path module from Python’s standard library to handle file paths in a platform-independent manner. By following these best practices, you can ensure that your Python code correctly deals with Windows paths.
Manipulating filesystem paths as string objects can quickly become cumbersome: multiple calls to os.path.join() or os.path.dirname(), etc. This module offers a set of classes featuring all the common operations on paths in an easy, object-oriented way.
This module is best used with Python 3.2 or later, but it is also compatible
with Python 2.7.
Note
This module has been included
in the Python 3.4 standard library after PEP 428 acceptance. You only
need to install it for Python 3.3 or older.
See also
PEP 428: Rationale for the final pathlib design and API.
High-level view
This module offers classes representing filesystem paths with semantics
appropriate for different operating systems. Path classes are divided
between pure paths, which provide purely computational
operations without I/O, and concrete paths, which
inherit from pure paths but also provide I/O operations.
If you’ve never used this module before or just aren’t sure which class is right for your task, Path is most likely what you need. It instantiates a concrete path for the platform the code is running on.
Pure paths are useful in some special cases; for example:
- If you want to manipulate Windows paths on a Unix machine (or vice versa). You cannot instantiate a WindowsPath when running on Unix, but you can instantiate PureWindowsPath.
- You want to make sure that your code only manipulates paths without actually accessing the OS. In this case, instantiating one of the pure classes may be useful since those simply don’t have any OS-accessing operations.
Basic use
Importing the module classes:
>>> from pathlib import *
Listing subdirectories:
>>> p = Path('.')
>>> [x for x in p.iterdir() if x.is_dir()]
[PosixPath('.hg'), PosixPath('docs'), PosixPath('dist'),
 PosixPath('__pycache__'), PosixPath('build')]
Listing Python source files in this directory tree:
>>> list(p.glob('**/*.py'))
[PosixPath('test_pathlib.py'), PosixPath('setup.py'),
 PosixPath('pathlib.py'), PosixPath('docs/conf.py'),
 PosixPath('build/lib/pathlib.py')]
Navigating inside a directory tree:
>>> p = Path('/etc')
>>> q = p / 'init.d' / 'reboot'
>>> q
PosixPath('/etc/init.d/reboot')
>>> q.resolve()
PosixPath('/etc/rc.d/init.d/halt')
Querying path properties:
>>> q.exists()
True
>>> q.is_dir()
False
Opening a file:
>>> with q.open() as f: f.readline()
...
'#!/bin/bash\n'
Pure paths
Pure path objects provide path-handling operations which don’t actually
access a filesystem. There are three ways to access these classes, which
we also call flavours:
- class pathlib.PurePath(*pathsegments)
A generic class that represents the system’s path flavour (instantiating
it creates either a PurePosixPath or a PureWindowsPath):
>>> PurePath('setup.py')      # Running on a Unix machine
PurePosixPath('setup.py')
Each element of pathsegments can be either a string representing a path
segment or another path object (under Python 3, bytes segments are
rejected with a TypeError):
>>> PurePath('foo', 'some/path', 'bar')
PurePosixPath('foo/some/path/bar')
>>> PurePath(Path('foo'), Path('bar'))
PurePosixPath('foo/bar')
When pathsegments is empty, the current directory is assumed:
>>> PurePath()
PurePosixPath('.')
When several absolute paths are given, the last is taken as an anchor
(mimicking os.path.join()’s behaviour):
>>> PurePath('/etc', '/usr', 'lib64')
PurePosixPath('/usr/lib64')
>>> PureWindowsPath('c:/Windows', 'd:bar')
PureWindowsPath('d:bar')
However, in a Windows path, changing the local root doesn’t discard the
previous drive setting:
>>> PureWindowsPath('c:/Windows', '/Program Files')
PureWindowsPath('c:/Program Files')
Spurious slashes and single dots are collapsed, but double dots ('..')
are not, since this would change the meaning of a path in the face of
symbolic links:
>>> PurePath('foo//bar')
PurePosixPath('foo/bar')
>>> PurePath('foo/./bar')
PurePosixPath('foo/bar')
>>> PurePath('foo/../bar')
PurePosixPath('foo/../bar')
(a naïve approach would make PurePosixPath('foo/../bar') equivalent
to PurePosixPath('bar'), which is wrong if foo is a symbolic link
to another directory)
- class pathlib.PurePosixPath(*pathsegments)
A subclass of PurePath, this path flavour represents non-Windows
filesystem paths:
>>> PurePosixPath('/etc')
PurePosixPath('/etc')
pathsegments is specified similarly to PurePath.
- class pathlib.PureWindowsPath(*pathsegments)
A subclass of PurePath, this path flavour represents Windows
filesystem paths:
>>> PureWindowsPath('c:/Program Files/')
PureWindowsPath('c:/Program Files')
pathsegments is specified similarly to PurePath.
Regardless of the system you’re running on, you can instantiate all of
these classes, since they don’t provide any operation that does system calls.
General properties
Paths are immutable and hashable. Paths of the same flavour are comparable
and orderable. These properties respect the flavour’s case-folding
semantics:
>>> PurePosixPath('foo') == PurePosixPath('FOO')
False
>>> PureWindowsPath('foo') == PureWindowsPath('FOO')
True
>>> PureWindowsPath('FOO') in { PureWindowsPath('foo') }
True
>>> PureWindowsPath('C:') < PureWindowsPath('d:')
True
Paths of a different flavour compare unequal and cannot be ordered:
>>> PureWindowsPath('foo') == PurePosixPath('foo')
False
>>> PureWindowsPath('foo') < PurePosixPath('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'PureWindowsPath' and 'PurePosixPath'
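Because paths are hashable and orderable, they can go straight into sets and sorted(). The sketch below illustrates how the flavour's case rules carry over to de-duplication; the file names are invented for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Windows paths compare case-insensitively, so a set collapses these two.
unique_windows = {
    PureWindowsPath("Docs/README.txt"),
    PureWindowsPath("docs/readme.TXT"),
}

# POSIX paths are case-sensitive, so both entries survive.
unique_posix = {
    PurePosixPath("Docs/README.txt"),
    PurePosixPath("docs/readme.TXT"),
}

# Paths of one flavour can be sorted directly, component by component.
ordered = sorted([PurePosixPath("b/z"), PurePosixPath("a/y"), PurePosixPath("a/x")])

print(len(unique_windows), len(unique_posix), ordered)
```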
Operators
The slash operator helps create child paths, similarly to os.path.join:
>>> p = PurePath('/etc')
>>> p
PurePosixPath('/etc')
>>> p / 'init.d' / 'apache2'
PurePosixPath('/etc/init.d/apache2')
>>> q = PurePath('bin')
>>> '/usr' / q
PurePosixPath('/usr/bin')
The string representation of a path is the raw filesystem path itself
(in native form, e.g. with backslashes under Windows), which you can
pass to any function taking a file path as a string:
>>> p = PurePath('/etc')
>>> str(p)
'/etc'
>>> p = PureWindowsPath('c:/Program Files')
>>> str(p)
'c:\\Program Files'
Similarly, calling bytes on a path gives the raw filesystem path as a
bytes object, as encoded by os.fsencode.
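A short sketch of both conversions side by side, using a POSIX-flavoured pure path so the result does not depend on the machine running it:

```python
import os
from pathlib import PurePosixPath

p = PurePosixPath("/etc")

as_text = str(p)      # the path as a plain string
as_bytes = bytes(p)   # the same path, encoded the way os.fsencode would

print(as_text, as_bytes, as_bytes == os.fsencode(as_text))
```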
Accessing individual parts
To access the individual “parts” (components) of a path, use the following
property:
- PurePath.parts
A tuple giving access to the path’s various components:
>>> p = PurePath('/usr/bin/python3')
>>> p.parts
('/', 'usr', 'bin', 'python3')
>>> p = PureWindowsPath('c:/Program Files/PSF')
>>> p.parts
('c:\\', 'Program Files', 'PSF')
(note how the drive and local root are regrouped in a single part)
Methods and properties
Pure paths provide the following methods and properties:
- PurePath.drive
A string representing the drive letter or name, if any:
>>> PureWindowsPath('c:/Program Files/').drive
'c:'
>>> PureWindowsPath('/Program Files/').drive
''
>>> PurePosixPath('/etc').drive
''
UNC shares are also considered drives:
>>> PureWindowsPath('//host/share/foo.txt').drive
'\\\\host\\share'
- PurePath.root
A string representing the (local or global) root, if any:
>>> PureWindowsPath('c:/Program Files/').root
'\\'
>>> PureWindowsPath('c:Program Files/').root
''
>>> PurePosixPath('/etc').root
'/'
UNC shares always have a root:
>>> PureWindowsPath('//host/share').root
'\\'
- PurePath.anchor
The concatenation of the drive and root:
>>> PureWindowsPath('c:/Program Files/').anchor
'c:\\'
>>> PureWindowsPath('c:Program Files/').anchor
'c:'
>>> PurePosixPath('/etc').anchor
'/'
>>> PureWindowsPath('//host/share').anchor
'\\\\host\\share\\'
- PurePath.parents
An immutable sequence providing access to the logical ancestors of
the path:
>>> p = PureWindowsPath('c:/foo/bar/setup.py')
>>> p.parents[0]
PureWindowsPath('c:/foo/bar')
>>> p.parents[1]
PureWindowsPath('c:/foo')
>>> p.parents[2]
PureWindowsPath('c:/')
- PurePath.parent
The logical parent of the path:
>>> p = PurePosixPath('/a/b/c/d')
>>> p.parent
PurePosixPath('/a/b/c')
You cannot go past an anchor, or empty path:
>>> p = PurePosixPath('/')
>>> p.parent
PurePosixPath('/')
>>> p = PurePosixPath('.')
>>> p.parent
PurePosixPath('.')
Note
This is a purely lexical operation, hence the following behaviour:
>>> p = PurePosixPath('foo/..')
>>> p.parent
PurePosixPath('foo')
If you want to walk an arbitrary filesystem path upwards, it is
recommended to first call Path.resolve() so as to resolve
symlinks and eliminate “..” components.
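To see parent and parents together, here is a small sketch that lists every ancestor of a path. The path is hypothetical and is never looked up on disk, which is also why the last line shows the lexical behaviour described in the note.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/home/user/project/src/main.py")  # hypothetical location

# parents is a sequence: it can be iterated, indexed and measured.
ancestors = [str(parent) for parent in p.parents]
depth = len(p.parents)

# parent is purely lexical, so '..' is treated like any other component.
lexical_parent = PurePosixPath("foo/..").parent

print(ancestors, depth, lexical_parent)
```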
- PurePath.name
A string representing the final path component, excluding the drive and
root, if any:
>>> PurePosixPath('my/library/setup.py').name
'setup.py'
UNC drive names are not considered:
>>> PureWindowsPath('//some/share/setup.py').name
'setup.py'
>>> PureWindowsPath('//some/share').name
''
- PurePath.suffix
The file extension of the final component, if any:
>>> PurePosixPath('my/library/setup.py').suffix
'.py'
>>> PurePosixPath('my/library.tar.gz').suffix
'.gz'
>>> PurePosixPath('my/library').suffix
''
- PurePath.suffixes
A list of the path’s file extensions:
>>> PurePosixPath('my/library.tar.gar').suffixes
['.tar', '.gar']
>>> PurePosixPath('my/library.tar.gz').suffixes
['.tar', '.gz']
>>> PurePosixPath('my/library').suffixes
[]
- PurePath.stem
The final path component, without its suffix:
>>> PurePosixPath('my/library.tar.gz').stem
'library.tar'
>>> PurePosixPath('my/library.tar').stem
'library'
>>> PurePosixPath('my/library').stem
'library'
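Since stem only drops the last suffix, a file such as library.tar.gz keeps its ".tar". One possible way to strip every extension is to combine name and suffixes, as in the sketch below. strip_all_suffixes is a hypothetical helper written for this article, not part of pathlib.

```python
from pathlib import PurePosixPath


def strip_all_suffixes(path):
    """Return the final component with every suffix removed (hypothetical helper)."""
    # The suffixes sit back to back at the end of the name,
    # so cutting off their combined length leaves the bare name.
    total = sum(len(suffix) for suffix in path.suffixes)
    return path.name[: len(path.name) - total]


bare = strip_all_suffixes(PurePosixPath("my/library.tar.gz"))
untouched = strip_all_suffixes(PurePosixPath("my/library"))

print(bare, untouched)
```

Keep in mind that this treats every dot-separated piece as an extension, so a name like "report.v2.final.txt" would be reduced to "report".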
- PurePath.as_posix()
Return a string representation of the path with forward slashes (/):
>>> p = PureWindowsPath('c:\\windows')
>>> str(p)
'c:\\windows'
>>> p.as_posix()
'c:/windows'
- PurePath.as_uri()
Represent the path as a file URI. ValueError is raised if
the path isn’t absolute.
>>> p = PurePosixPath('/etc/passwd')
>>> p.as_uri()
'file:///etc/passwd'
>>> p = PureWindowsPath('c:/Windows')
>>> p.as_uri()
'file:///c:/Windows'
- PurePath.is_absolute()
Return whether the path is absolute or not. A path is considered absolute
if it has both a root and (if the flavour allows) a drive:
>>> PurePosixPath('/a/b').is_absolute()
True
>>> PurePosixPath('a/b').is_absolute()
False
>>> PureWindowsPath('c:/a/b').is_absolute()
True
>>> PureWindowsPath('/a/b').is_absolute()
False
>>> PureWindowsPath('c:').is_absolute()
False
>>> PureWindowsPath('//some/share').is_absolute()
True
- PurePath.is_reserved()
With PureWindowsPath, return True if the path is considered
reserved under Windows, False otherwise. With PurePosixPath,
False is always returned.
>>> PureWindowsPath('nul').is_reserved()
True
>>> PurePosixPath('nul').is_reserved()
False
File system calls on reserved paths can fail mysteriously or have
unintended effects.
- PurePath.joinpath(*other)
Calling this method is equivalent to combining the path with each of
the other arguments in turn:
>>> PurePosixPath('/etc').joinpath('passwd')
PurePosixPath('/etc/passwd')
>>> PurePosixPath('/etc').joinpath(PurePosixPath('passwd'))
PurePosixPath('/etc/passwd')
>>> PurePosixPath('/etc').joinpath('init.d', 'apache2')
PurePosixPath('/etc/init.d/apache2')
>>> PureWindowsPath('c:').joinpath('/Program Files')
PureWindowsPath('c:/Program Files')
- PurePath.match(pattern)
Match this path against the provided glob-style pattern. Return
True if matching is successful, False otherwise.
If pattern is relative, the path can be either relative or absolute,
and matching is done from the right:
>>> PurePath('a/b.py').match('*.py')
True
>>> PurePath('/a/b/c.py').match('b/*.py')
True
>>> PurePath('/a/b/c.py').match('a/*.py')
False
If pattern is absolute, the path must be absolute, and the whole path
must match:
>>> PurePath('/a.py').match('/*.py')
True
>>> PurePath('a/b.py').match('/*.py')
False
As with other methods, the flavour’s case-sensitivity rules are observed,
so a Windows path matches regardless of case:
>>> PureWindowsPath('b.py').match('*.PY')
True
- PurePath.relative_to(*other)
Compute a version of this path relative to the path represented by
other. If it’s impossible, ValueError is raised:
>>> p = PurePosixPath('/etc/passwd')
>>> p.relative_to('/')
PurePosixPath('etc/passwd')
>>> p.relative_to('/etc')
PurePosixPath('passwd')
>>> p.relative_to('/usr')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pathlib.py", line 694, in relative_to
    .format(str(self), str(formatted)))
ValueError: '/etc/passwd' does not start with '/usr'
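When it is not known in advance whether a path lives under a given base, the ValueError can be caught and turned into a sentinel value. This is only a sketch; relative_or_none is a hypothetical helper name, not something pathlib provides.

```python
from pathlib import PurePosixPath


def relative_or_none(path, base):
    """Return path relative to base, or None when path is not under base."""
    try:
        return path.relative_to(base)
    except ValueError:
        # relative_to refuses to invent '..' components on its own.
        return None


inside = relative_or_none(PurePosixPath("/etc/passwd"), "/etc")
outside = relative_or_none(PurePosixPath("/etc/passwd"), "/usr")

print(inside, outside)
```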
- PurePath.with_name(name)
Return a new path with the name changed. If the original path
doesn’t have a name, ValueError is raised:
>>> p = PureWindowsPath('c:/Downloads/pathlib.tar.gz')
>>> p.with_name('setup.py')
PureWindowsPath('c:/Downloads/setup.py')
>>> p = PureWindowsPath('c:/')
>>> p.with_name('setup.py')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/cpython/default/Lib/pathlib.py", line 751, in with_name
    raise ValueError("%r has an empty name" % (self,))
ValueError: PureWindowsPath('c:/') has an empty name
- PurePath.with_suffix(suffix)
Return a new path with the suffix changed. If the original path
doesn’t have a suffix, the new suffix is appended instead:
>>> p = PureWindowsPath('c:/Downloads/pathlib.tar.gz')
>>> p.with_suffix('.bz2')
PureWindowsPath('c:/Downloads/pathlib.tar.bz2')
>>> p = PureWindowsPath('README')
>>> p.with_suffix('.txt')
PureWindowsPath('README.txt')
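A common use of these two methods is deriving sibling file names from an existing one. The sketch below assumes a made-up CSV path and builds three variants from it. Because paths are immutable, the original is left untouched.

```python
from pathlib import PurePosixPath

source = PurePosixPath("/data/reports/summary.csv")  # hypothetical input file

as_json = source.with_suffix(".json")   # swap the extension
no_ext = source.with_suffix("")         # an empty suffix removes the extension
backup = source.with_name(source.stem + "_backup" + source.suffix)  # rename, keep extension

print(as_json, no_ext, backup, source)
```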
Concrete paths
Concrete paths are subclasses of the pure path classes. In addition to
operations provided by the latter, they also provide methods to do system
calls on path objects. There are three ways to instantiate concrete paths:
- class pathlib.Path(*pathsegments)
A subclass of PurePath, this class represents concrete paths of
the system’s path flavour (instantiating it creates either a
PosixPath or a WindowsPath):
>>> Path('setup.py')
PosixPath('setup.py')
pathsegments is specified similarly to PurePath.
- class pathlib.PosixPath(*pathsegments)
A subclass of Path and PurePosixPath, this class
represents concrete non-Windows filesystem paths:
>>> PosixPath('/etc')
PosixPath('/etc')
pathsegments is specified similarly to PurePath.
- class pathlib.WindowsPath(*pathsegments)
A subclass of Path and PureWindowsPath, this class
represents concrete Windows filesystem paths:
>>> WindowsPath('c:/Program Files/')
WindowsPath('c:/Program Files')
pathsegments is specified similarly to PurePath.
You can only instantiate the class flavour that corresponds to your system
(allowing system calls on non-compatible path flavours could lead to
bugs or failures in your application):
>>> import os
>>> os.name
'posix'
>>> Path('setup.py')
PosixPath('setup.py')
>>> PosixPath('setup.py')
PosixPath('setup.py')
>>> WindowsPath('setup.py')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pathlib.py", line 798, in __new__
    % (cls.__name__,))
NotImplementedError: cannot instantiate 'WindowsPath' on your system
Methods
Concrete paths provide the following methods in addition to pure paths
methods. Many of these methods can raise an OSError if a system
call fails (for example because the path doesn’t exist):
- classmethod Path.cwd()
Return a new path object representing the current directory (as returned
by os.getcwd()):
>>> Path.cwd()
PosixPath('/home/antoine/pathlib')
- Path.stat()
Return information about this path (similarly to os.stat()).
The result is looked up at each call to this method.
>>> p = Path('setup.py')
>>> p.stat().st_size
956
>>> p.stat().st_mtime
1327883547.852554
- Path.chmod(mode)
Change the file mode and permissions, like os.chmod():
>>> p = Path('setup.py')
>>> p.stat().st_mode
33277
>>> p.chmod(0o444)
>>> p.stat().st_mode
33060
- Path.exists()
Whether the path points to an existing file or directory:
>>> from pathlib import *
>>> Path('.').exists()
True
>>> Path('setup.py').exists()
True
>>> Path('/etc').exists()
True
>>> Path('nonexistentfile').exists()
False
- Path.glob(pattern)
Glob the given pattern in the directory represented by this path,
yielding all matching files (of any kind):
>>> sorted(Path('.').glob('*.py'))
[PosixPath('pathlib.py'), PosixPath('setup.py'), PosixPath('test_pathlib.py')]
>>> sorted(Path('.').glob('*/*.py'))
[PosixPath('docs/conf.py')]
The “**” pattern means “this directory and all subdirectories,
recursively”. In other words, it enables recursive globbing:
>>> sorted(Path('.').glob('**/*.py'))
[PosixPath('build/lib/pathlib.py'),
 PosixPath('docs/conf.py'),
 PosixPath('pathlib.py'),
 PosixPath('setup.py'),
 PosixPath('test_pathlib.py')]
Note
Using the “**” pattern in large directory trees may consume
an inordinate amount of time.
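To try globbing without depending on what happens to be in your working directory, the sketch below builds a tiny throwaway tree in a temporary directory first. The file and folder names are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Build a small sample tree: two files at the top, one inside pkg/.
    (root / "pkg").mkdir()
    for name in ("setup.py", "README.txt", "pkg/core.py"):
        (root / name).touch()

    # '*.py' only looks at the directory itself.
    top_level = sorted(p.name for p in root.glob("*.py"))

    # '**/*.py' also descends into every subdirectory.
    recursive = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.py"))

print(top_level, recursive)
```

Sorting the results is a deliberate choice here: glob() yields entries in whatever order the operating system reports them.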
- Path.group()
Return the name of the group owning the file. KeyError is raised
if the file’s gid isn’t found in the system database.
- Path.is_dir()
Return True if the path points to a directory (or a symbolic link
pointing to a directory), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink;
other errors (such as permission errors) are propagated.
- Path.is_file()
Return True if the path points to a regular file (or a symbolic link
pointing to a regular file), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink;
other errors (such as permission errors) are propagated.
- Path.is_symlink()
Return True if the path points to a symbolic link, False otherwise.
False is also returned if the path doesn’t exist; other errors (such
as permission errors) are propagated.
- Path.is_socket()
Return True if the path points to a Unix socket (or a symbolic link
pointing to a Unix socket), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink;
other errors (such as permission errors) are propagated.
- Path.is_fifo()
Return True if the path points to a FIFO (or a symbolic link
pointing to a FIFO), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink;
other errors (such as permission errors) are propagated.
- Path.is_block_device()
Return True if the path points to a block device (or a symbolic link
pointing to a block device), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink;
other errors (such as permission errors) are propagated.
- Path.is_char_device()
Return True if the path points to a character device (or a symbolic link
pointing to a character device), False if it points to another kind of file.
False is also returned if the path doesn’t exist or is a broken symlink;
other errors (such as permission errors) are propagated.
- Path.iterdir()
When the path points to a directory, yield path objects of the directory
contents:
>>> p = Path('docs')
>>> for child in p.iterdir(): child
...
PosixPath('docs/conf.py')
PosixPath('docs/_templates')
PosixPath('docs/make.bat')
PosixPath('docs/index.rst')
PosixPath('docs/_build')
PosixPath('docs/_static')
PosixPath('docs/Makefile')
- Path.lchmod(mode)
Like Path.chmod() but, if the path points to a symbolic link, the
symbolic link’s mode is changed rather than its target’s.
- Path.lstat()
Like Path.stat() but, if the path points to a symbolic link, return
the symbolic link’s information rather than its target’s.
- Path.mkdir(mode=0o777, parents=False)
Create a new directory at this given path. If mode is given, it is
combined with the process’ umask value to determine the file mode
and access flags. If the path already exists, OSError is raised.
If parents is true, any missing parents of this path are created
as needed; they are created with the default permissions without taking
mode into account (mimicking the POSIX mkdir -p command).
If parents is false (the default), a missing parent raises OSError.
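The three outcomes described above (missing parent, successful recursive creation, already-existing path) can be seen in a short sketch. It works inside a temporary directory so nothing is left behind, and the a/b/c layout is just an illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b" / "c"

    # parents defaults to False, so the missing 'a' and 'a/b' cause an OSError.
    try:
        target.mkdir()
        failed_without_parents = False
    except OSError:
        failed_without_parents = True

    # With parents=True the whole chain a, a/b and a/b/c is created.
    target.mkdir(parents=True)
    created = target.is_dir()

    # Calling it again fails because the directory now exists.
    try:
        target.mkdir(parents=True)
        second_call_failed = False
    except OSError:
        second_call_failed = True

print(failed_without_parents, created, second_call_failed)
```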
- Path.open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)
Open the file pointed to by the path, like the built-in open()
function does:
>>> p = Path('setup.py')
>>> with p.open() as f:
...     f.readline()
...
'#!/usr/bin/env python3\n'
- Path.owner()
Return the name of the user owning the file. KeyError is raised
if the file’s uid isn’t found in the system database.
- Path.rename(target)
Rename this file or directory to the given target. target can be
either a string or another path object:
>>> p = Path('foo')
>>> p.open('w').write('some text')
9
>>> target = Path('bar')
>>> p.rename(target)
>>> target.open().read()
'some text'
- Path.replace(target)
Rename this file or directory to the given target. If target points
to an existing file or directory, it will be unconditionally replaced.
This method is only available with Python 3.3 and later; it will raise
NotImplementedError on earlier Python versions.
- Path.resolve()
Make the path absolute, resolving any symlinks. A new path object is
returned:
>>> p = Path()
>>> p
PosixPath('.')
>>> p.resolve()
PosixPath('/home/antoine/pathlib')
“..” components are also eliminated (this is the only method to do so):
>>> p = Path('docs/../setup.py')
>>> p.resolve()
PosixPath('/home/antoine/pathlib/setup.py')
If the path doesn’t exist, an OSError is raised. If an infinite
loop is encountered along the resolution path, RuntimeError is
raised.
- Path.rglob(pattern)
This is like calling glob() with “**” added in front of the
given pattern:
>>> sorted(Path().rglob("*.py"))
[PosixPath('build/lib/pathlib.py'),
 PosixPath('docs/conf.py'),
 PosixPath('pathlib.py'),
 PosixPath('setup.py'),
 PosixPath('test_pathlib.py')]
- Path.rmdir()
Remove this directory. The directory must be empty.
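Because rmdir() only works on empty directories, removing a directory along with its contents means emptying it first. The sketch below does that recursively with iterdir(), unlink() and rmdir(). remove_tree is a hypothetical helper written for illustration; in real code the standard library's shutil.rmtree() is usually the simpler choice.

```python
import tempfile
from pathlib import Path


def remove_tree(directory):
    """Delete directory and everything below it (hypothetical helper)."""
    for child in directory.iterdir():
        if child.is_dir() and not child.is_symlink():
            remove_tree(child)   # empty the subdirectory, then remove it
        else:
            child.unlink()       # files and symlinks are removed directly
    directory.rmdir()            # the directory is empty at this point


with tempfile.TemporaryDirectory() as tmp:
    # Build a small throwaway tree to delete.
    victim = Path(tmp) / "build"
    (victim / "lib").mkdir(parents=True)
    (victim / "lib" / "module.py").touch()
    (victim / "notes.txt").touch()

    remove_tree(victim)
    still_there = victim.exists()

print(still_there)
```

Symbolic links to directories are unlinked rather than followed, so the helper never deletes anything outside the tree it was given.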
- Path.symlink_to(target, target_is_directory=False)
Make this path a symbolic link to target. Under Windows,
target_is_directory must be true (default False) if the link’s target
is a directory. Under POSIX, target_is_directory’s value is ignored.
>>> p = Path('mylink')
>>> p.symlink_to('setup.py')
>>> p.resolve()
PosixPath('/home/antoine/pathlib/setup.py')
>>> p.stat().st_size
956
>>> p.lstat().st_size
8
Note
The order of arguments (link, target) is the reverse
of os.symlink()’s.
- Path.touch(mode=0o666, exist_ok=True)
Create a file at this given path. If mode is given, it is combined
with the process’ umask value to determine the file mode and access
flags. If the file already exists, the function succeeds if exist_ok
is true (and its modification time is updated to the current time),
otherwise OSError is raised.
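Here is a minimal sketch of creating an empty file and of what exist_ok changes, again inside a temporary directory. The file name done.flag is made up.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "done.flag"

    marker.touch()                    # creates an empty file
    size = marker.stat().st_size

    marker.touch()                    # file exists: only its modification time is refreshed

    try:
        marker.touch(exist_ok=False)  # refuses to reuse an existing file
        raised = False
    except OSError:
        raised = True

print(size, raised)
```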
- Path.unlink()
Remove this file or symbolic link. If the path points to a directory,
use Path.rmdir() instead.