liaoch committed
Commit dd5c2cf · 1 Parent(s): 1ce97d9

Update for Docker Spaces best practices: Add app_port to README.md and improve Dockerfile with user permissions

Files changed (3):
  1. Dockerfile +20 -9
  2. README.md +1 -0
  3. docs/docker-spaces.md +119 -0
Dockerfile CHANGED
```diff
@@ -53,24 +53,35 @@ RUN apt-get update && \
     apt-get clean && \
     rm -rf /var/lib/apt/lists/*
 
-# Set the working directory in the container
-WORKDIR /app
+# Set up a new user named "user" with user ID 1000 (required for Docker Spaces)
+RUN useradd -m -u 1000 user
+
+# Switch to the "user" user
+USER user
+
+# Set home to the user's home directory
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH
+
+# Set the working directory to the user's home directory
+WORKDIR $HOME/app
 
 # Install mermaid-cli globally
 # Use --unsafe-perm if needed for permissions during global install
 RUN npm install -g @mermaid-js/mermaid-cli --unsafe-perm=true
 
 # Copy the requirements file first to leverage Docker cache
-COPY requirements.txt ./
+# Copy with proper ownership
+COPY --chown=user requirements.txt $HOME/app/
 
 # Create a virtual environment and install Python dependencies
 # This isolates Python packages within the container, similar to local setup
-RUN python3 -m venv /app/venv
+RUN python3 -m venv $HOME/app/venv
 # Activate venv for the RUN command and install packages
-RUN . /app/venv/bin/activate && pip install --no-cache-dir -r requirements.txt
+RUN $HOME/app/venv/bin/pip install --no-cache-dir -r $HOME/app/requirements.txt
 
-# Copy the rest of the application code into the container
-COPY . .
+# Copy the rest of the application code into the container with proper ownership
+COPY --chown=user . $HOME/app
 
 # Make port 80 available to the world outside this container (required for Hugging Face Spaces)
 EXPOSE 80
@@ -83,7 +94,7 @@ ENV FLASK_RUN_HOST=0.0.0.0
 ENV FLASK_RUN_PORT=80
 ENV FLASK_SECRET_KEY="replace_this_in_docker_run_with_a_real_secret"
 # Add venv's bin to the PATH for subsequent commands (like CMD)
-ENV PATH="/app/venv/bin:$PATH"
+ENV PATH="$HOME/app/venv/bin:$PATH"
 
 # Override base image entrypoint so CMD executes directly
 ENTRYPOINT []
@@ -94,4 +105,4 @@ ENTRYPOINT []
 # Use port 80 (required for Hugging Face Spaces)
 # The number of workers (e.g., --workers 3) can be adjusted based on server resources
 # Use JSON form with the absolute path to gunicorn in the venv to avoid PATH issues
-CMD ["/app/venv/bin/gunicorn", "--workers", "3", "--bind", "0.0.0.0:80", "--timeout", "60", "app:app"]
+CMD ["gunicorn", "--workers", "3", "--bind", "0.0.0.0:80", "--timeout", "60", "app:app"]
```
README.md CHANGED
```diff
@@ -4,6 +4,7 @@ emoji: 🔥
 colorFrom: purple
 colorTo: red
 sdk: docker
+app_port: 80
 pinned: false
 license: mit
 short_description: mermaid-rendering docker version
```
docs/docker-spaces.md ADDED
# Docker Spaces

Spaces accommodate custom Docker containers for apps outside the scope of Streamlit and Gradio. Docker Spaces allow users to go beyond the limits of what was previously possible with the standard SDKs. From FastAPI and Go endpoints to Phoenix apps and ML Ops tools, Docker Spaces can help in many different setups.

## Setting up Docker Spaces

Selecting Docker as the SDK when creating a new Space will initialize your Space by setting the `sdk` property to `docker` in your README.md file's YAML block. Alternatively, given an existing Space repository, set `sdk: docker` inside the YAML block at the top of your Space's README.md file. You can also change the default exposed port 7860 by setting `app_port`. Afterwards, you can create a usual Dockerfile.

```yaml
title: Basic Docker SDK Space
emoji: 🐳
colorFrom: purple
colorTo: gray
sdk: docker
app_port: 7860
```

Internally, you could have as many open ports as you want. For instance, you can install Elasticsearch inside your Space and call it internally on its default port 9200.

If you want to expose apps served on multiple ports to the outside world, a workaround is to use a reverse proxy like Nginx to dispatch requests from the broader internet (on a single port) to different internal ports.
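As a sketch of that workaround (the ports, paths, and upstream services below are hypothetical, not taken from the docs), an Nginx server block might dispatch requests like this:

```nginx
# Listen on the single port exposed via app_port and route by path
server {
    listen 7860;

    # Hypothetical API backend running on an internal port
    location /api/ {
        proxy_pass http://127.0.0.1:8000/;
    }

    # Everything else goes to a hypothetical frontend on another internal port
    location / {
        proxy_pass http://127.0.0.1:3000/;
    }
}
```

Only port 7860 needs to be declared in `app_port`; the upstream ports stay internal to the container.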
## Secrets and Variables Management

You can manage a Space's environment variables in the Space Settings. Read more here.

### Variables

#### Buildtime

Variables are passed as build-args when building your Docker Space. Read Docker's dedicated documentation for a complete guide on how to use them in the Dockerfile.

```Dockerfile
FROM python:latest

# Declare the build argument after FROM so it stays in scope for later instructions
ARG MODEL_REPO_NAME
# [...]
# You can use it like an environment variable
RUN predict.py $MODEL_REPO_NAME
```

#### Runtime

Variables are injected into the container's environment at runtime.
### Secrets

#### Buildtime

In Docker Spaces, secrets management is different for security reasons. Once you create a secret in the Settings tab, you can expose it to the build by mounting it in your Dockerfile.

For example, if `SECRET_EXAMPLE` is the name of the secret you created in the Settings tab, you can read it at build time by mounting it to a file, then reading it with `$(cat /run/secrets/SECRET_EXAMPLE)`.

See the examples below:

```Dockerfile
# Expose the secret SECRET_EXAMPLE at buildtime and use its value as the git remote URL
RUN --mount=type=secret,id=SECRET_EXAMPLE,mode=0444,required=true \
    git init && \
    git remote add origin $(cat /run/secrets/SECRET_EXAMPLE)
```

```Dockerfile
# Expose the secret SECRET_EXAMPLE at buildtime and use its value as a Bearer token for a curl request
RUN --mount=type=secret,id=SECRET_EXAMPLE,mode=0444,required=true \
    curl test -H "Authorization: Bearer $(cat /run/secrets/SECRET_EXAMPLE)"
```

#### Runtime

Same as for public Variables, at runtime you can access secrets as environment variables. For example, in Python you would use `os.environ.get("SECRET_EXAMPLE")`. Check out this example of a Docker Space that uses secrets.
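Both runtime Variables and Secrets surface the same way in application code. A minimal sketch (the variable name and the fail-loudly helper are illustrative, not from the docs):

```python
import os

def get_env_or_fail(name: str) -> str:
    """Return a runtime Variable/Secret injected into the container, failing loudly if absent."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"{name} is not set; define it in the Space Settings")
    return value

# Stand-in for the value the Space would inject at runtime
os.environ["SECRET_EXAMPLE"] = "demo-value"
print(get_env_or_fail("SECRET_EXAMPLE"))
```

Failing fast on a missing variable makes misconfigured Settings obvious in the Space logs instead of surfacing later as a confusing `None`.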
## Permissions

The container runs with user ID 1000. To avoid permission issues, create a user and set its `WORKDIR` before any `COPY` or download.

```Dockerfile
# Set up a new user named "user" with user ID 1000
RUN useradd -m -u 1000 user

# Switch to the "user" user
USER user

# Set home to the user's home directory
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH

# Set the working directory to the user's home directory
WORKDIR $HOME/app

# Run pip after switching with `USER user` to avoid permission issues with Python
RUN pip install --no-cache-dir --upgrade pip

# Copy the current directory contents into the container at $HOME/app, setting the owner to the user
COPY --chown=user . $HOME/app

# Download a checkpoint
RUN mkdir content
ADD --chown=user https://<SOME_ASSET_URL> content/<SOME_ASSET_NAME>
```

Always specify `--chown=user` with `ADD` and `COPY` to ensure the new files are owned by your user.
If you still face permission issues, you might need to use `chmod` or `chown` in your Dockerfile to grant the right permissions. For example, if you want to use the directory `/data`, you can do:

```Dockerfile
RUN mkdir -p /data
RUN chmod 777 /data
```

You should always avoid superfluous chowns. Updating the metadata of a file creates a new copy stored in a new layer, so a recursive chown can result in a very large image due to the duplication of all affected files.

Rather than fixing permissions by running chown:

```Dockerfile
COPY checkpoint .
RUN chown -R user checkpoint
```

you should always do:

```Dockerfile
COPY --chown=user checkpoint .
```

(The same goes for the `ADD` command.)
## Data Persistence

Data written to disk is lost whenever your Docker Space restarts, unless you opt in to a persistent storage upgrade.

If you opt in to a persistent storage upgrade, you can use the `/data` directory to store data. This directory is mounted on a persistent volume, which means that data written to it is persisted across restarts.

At the moment, the `/data` volume is only available at runtime, i.e. you cannot use `/data` during the build step of your Dockerfile.

You can also use our Datasets Hub for specific cases, where you can store state and data in a git LFS repository. You can find an example of persistence here, which uses the `huggingface_hub` library to programmatically upload files to a dataset repository. This Space example, along with this guide, will help you decide which solution best fits your data type.

Finally, in some cases you might want to use an external storage solution from your Space's code, like an externally hosted DB, S3, etc.
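Since `/data` only exists at runtime (and only with the storage upgrade), application code often probes for it and falls back to ephemeral local storage. A sketch under those assumptions (the fallback path and file name are illustrative):

```python
import os
from pathlib import Path

def storage_dir() -> Path:
    """Prefer the persistent /data volume when it is mounted and writable."""
    persistent = Path("/data")
    if persistent.is_dir() and os.access(persistent, os.W_OK):
        return persistent
    fallback = Path("./data")  # ephemeral: lost when the Space restarts
    fallback.mkdir(exist_ok=True)
    return fallback

# Illustrative use: a counter that survives restarts only with persistent storage
counter = storage_dir() / "restart_counter.txt"
count = int(counter.read_text()) + 1 if counter.exists() else 1
counter.write_text(str(count))
print(count)
```

With the same code path for both cases, enabling the persistent storage upgrade requires no application changes.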
## Docker container with GPU

You can run Docker containers with GPU support by using one of our GPU-flavored Spaces Hardware.

We recommend using the `nvidia/cuda` image from Docker Hub as a base image, which comes with CUDA and cuDNN pre-installed.

During Docker buildtime, you don't have access to GPU hardware. Therefore, you should not run any GPU-related command during the build step of your Dockerfile. For example, you can't run `nvidia-smi` or `torch.cuda.is_available()` while building an image. Read more [here](https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker#description).
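As a hypothetical starting point (the CUDA image tag and installed packages are illustrative; pick versions matching your framework):

```Dockerfile
# CUDA and cuDNN come pre-installed in the nvidia/cuda base images
FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04

RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# No GPU commands in RUN steps: the GPU is only attached at runtime.
# Checks like nvidia-smi belong in CMD/ENTRYPOINT instead.
CMD ["nvidia-smi"]
```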