---
title: My Ghost Writer
emoji: ✍️
colorFrom: red
colorTo: blue
sdk: docker
app_port: 7860
pinned: true
license: agpl-3.0
---

# My Ghost Writer

A simple helper for writers.

## Overview

[My Ghost Writer](https://github.com/trincadev/my_ghost_writer/) is a web application that analyzes text and provides word frequency statistics. Users can upload or type in a text; the application then displays the most common words, their frequencies and their positions within the text editor. It uses natural language processing (NLP) techniques to stem words, so that inflected forms of the same word are counted together and patterns in the text are easier to spot.
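
The idea of counting words by stem while keeping their positions can be sketched as follows. This is only an illustration, not the project's implementation: `naive_stem` is a hypothetical toy suffix stripper standing in for a real `nltk` stemmer, and `words_frequency` and the shape of its result are assumptions made for the example.

```python
import re
from collections import defaultdict


def naive_stem(word: str) -> str:
    """Toy stand-in for a real stemmer: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        # keep at least three characters so short words are left alone
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def words_frequency(text: str) -> dict:
    """Group words by stem, with a count and the character offsets of each occurrence."""
    stats = defaultdict(lambda: {"count": 0, "offsets": []})
    for match in re.finditer(r"[A-Za-z']+", text):
        stem = naive_stem(match.group().lower())
        stats[stem]["count"] += 1
        # (start, end) offsets let an editor highlight the word in place
        stats[stem]["offsets"].append((match.start(), match.end()))
    return dict(stats)


result = words_frequency("She walks. He walked. They are walking.")
print(result["walk"]["count"])  # the three inflected forms share one stem
```

Grouping by stem is what lets "walks", "walked" and "walking" count as one entry, while the stored offsets are enough to point back to each occurrence in the editor.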

## Features

* Analyse large texts and provide word frequency statistics
* Use NLP to stem words for more accurate results
* Support for uploading or typing in text
* User-friendly interface with a simple editor and display of word frequencies
* WIP: thesaurus, powered by [wordsapi](https://www.wordsapi.com/) (you need to get your own wordsapi API key)

## Technologies Used

* Python 3.10+ (FastAPI web framework)
* Vanilla JavaScript frontend, with [Playwright](https://playwright.dev/) for E2E testing
* [`nltk`](https://www.nltk.org/) library for natural language processing
* [`structlog`](https://www.structlog.org/) for logging and error handling

## Getting Started

In a Linux/WSL environment (I haven't tried macOS or Windows):

1. Clone the repository and enter it: `git clone https://github.com/trincadev/my_ghost_writer`, then `cd my_ghost_writer`
2. Create a [virtualenv](https://virtualenv.pypa.io/en/latest/user_guide.html) and install the project dependencies with an existing Python version, using either:

   * [poetry](https://python-poetry.org/) (`poetry env use 3.12.10`, `poetry install`, `eval $(poetry env activate)`)
   * `python -m venv .venv`, `source .venv/bin/activate`, `pip install -r requirements.txt` (plus the other requirements files if you also need the webserver and/or the test environment)

3. Run the application using:
   * `python my_ghost_writer/app.py`, using the path to the `app.py` file
   * `python -m my_ghost_writer.app`, using the Python module

### Run as a python module

When running the webserver as a module (`python -m my_ghost_writer.app`), one of these environment variables is required:

* `STATIC_FOLDER` to define a custom path for the static folder. You will probably also need to download the static files:
  * `index.html`
  * `index.js`
  * `index.css`
* `API_MODE` to avoid mounting the static folder. In this mode only the API endpoints are defined:
  * `/health`
  * `/health-mongo`
  * `/words-frequency`
  * `/thesaurus-wordsapi`
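
A minimal sketch of how these two variables might be resolved at startup. It is not taken from the project's code: `resolve_static_folder` is a hypothetical helper, and treating any non-empty `API_MODE` value as "enabled" is an assumption.

```python
import os
from pathlib import Path


def resolve_static_folder(env=os.environ):
    """Return None in API-only mode, otherwise the static folder to mount."""
    # Assumption: any non-empty API_MODE value switches to API-only mode
    if env.get("API_MODE"):
        return None
    folder = env.get("STATIC_FOLDER")
    if not folder:
        # neither variable is set: fail early with an explicit message
        raise RuntimeError("Set STATIC_FOLDER or API_MODE before starting the module")
    return Path(folder)


# usage with an explicit mapping instead of the real environment
print(resolve_static_folder({"STATIC_FOLDER": "/tmp/static"}))
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.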

### Installation script

An alternative way to use the project is to install it with `install.sh`, e.g.:

```bash
bash ./install.sh
```

If you want to run my custom frontend using this script (available by default on port 7860):

1. use the install-only option
2. define a custom path for `STATIC_FOLDER` and use it for the module execution:

```bash
# run the script with the install-only option
bash install.sh -i

# run the python module with the custom STATIC_FOLDER env variable, e.g.
# if you already created STATIC_FOLDER within the current directory with the needed files within, see above
export STATIC_FOLDER=$PWD/static
python -m my_ghost_writer.app
```

## Local mongodb needed for the thesaurus feature

To run a local MongoDB instance, you can use this docker command:

```bash
docker run --env=MONGO_MAJOR=8.0 --name mongo \
--env=HOME=/data/db --volume=${LOCAL_MONGO_FOLDER}:/data -p 27017:27017 \
--volume=/data/configdb --volume=/data/db --network=bridge --restart=always \
-d mongo:8-noble
```

## Docker

To build the project with docker:

```bash
DOCKER_VERSION=$(grep version pyproject.toml |head -1|cut -d'=' -f2|cut -d'"' -f2);
docker build . -f dockerfiles/dockerfile_my_ghost_writer_base  --progress=plain --tag registry.gitlab.com/aletrn/my_ghost_writer_base:${DOCKER_VERSION}
docker build . --progress=plain --tag registry.gitlab.com/aletrn/my_ghost_writer:${DOCKER_VERSION}
```

To run the docker container (you still need to configure the MongoDB endpoint to use the single my_ghost_writer container):

```bash
docker run -d --name my_ghost_writer -p 7860:7860 -e ME_CONFIG_MONGODB_USE_OK=FALSE \
    -e ALLOWED_ORIGIN=http://localhost:7860,http://localhost:8000 \
    registry.gitlab.com/aletrn/my_ghost_writer:${DOCKER_VERSION}; docker logs -f my_ghost_writer
```
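
Once the container is up, a quick way to confirm it responds is to query the `/health` endpoint listed earlier. The sketch below uses only the standard library; `check_health` is a hypothetical helper, and it only checks for an HTTP 200 status because the response body format is not documented here.

```python
import http.client
import urllib.request


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # connection refused, timeout, HTTP error status or malformed reply
        return False


if __name__ == "__main__":
    # port 7860 matches the `-p 7860:7860` mapping used above
    print(check_health("http://localhost:7860"))
```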

To source more than one env variable, you can use this command:

```bash
set -o allexport && source <(cat ./.env) && set +o allexport;
```

## Contributing

Pull requests are welcome! Please make sure to test your changes thoroughly before submitting a pull request.

This project is still in its early stages, and there are many features that can be added to make it more useful for writers.

If you have any suggestions or would like to contribute to the project, please don't hesitate to reach out!