
Model description

Fine-tuned gpt-neo-125M model for predicting the prices of houses or apartments in Cali, Colombia. Download all required files from Dropbox.

  • Developed by: Nicolai Potes
  • Language: Python
  • Finetuned from model: gpt-neo-125M

Training Details

  Num examples = 779
  Num Epochs = 500
  Instantaneous batch size per device = 80
  Total train batch size (w. parallel, distributed & accumulation) = 80
  Gradient Accumulation steps = 1
  Total optimization steps = 5000
  Number of trainable parameters = 125200128
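
The step count in the log follows from the other values. A minimal sketch of the arithmetic, assuming the last partial batch of each epoch is kept rather than dropped (variable names here are illustrative, not taken from the training script):

```python
import math

num_examples = 779
batch_size = 80      # per-device batch size; a single device is assumed
grad_accum = 1
epochs = 500

# One optimizer step per batch; the final, smaller batch still counts as a step.
steps_per_epoch = math.ceil(num_examples / (batch_size * grad_accum))
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)
```

With 779 examples and a batch size of 80, each epoch takes 10 steps, which matches the 5000 total optimization steps reported for 500 epochs.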

Training Evaluation

 {'eval_loss': 1.341125726699829,
 'eval_runtime': 23.3347,
 'eval_samples_per_second': 300.111,
 'eval_steps_per_second': 3.771,
 'epoch': 500.0}
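
For a causal language model, the evaluation loss is usually easier to read as a perplexity. A small sketch, assuming `eval_loss` is the mean per-token cross-entropy in nats (the usual convention for this kind of model, but not stated in the card):

```python
import math

eval_loss = 1.341125726699829  # value copied from the evaluation output above

# Perplexity is the exponential of the mean cross-entropy loss.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")
```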

Training Data

Data collected from https://www.metrocuadrado.com/. Format used to train the model:

 'meter: 3651685 \n area: 267 \n bathroom: 4 \n room: 4 \n property: 1 \n price: 975000000',
 'meter: 3206498 \n area: 70 \n bathroom: 3 \n room: 4 \n property: 2 \n price: 225000000',
 'meter: 2181818 \n area: 110 \n bathroom: 2 \n room: 3 \n property: 2 \n price: 240000000',
 'meter: 5882352 \n area: 306 \n bathroom: 4 \n room: 4 \n property: 2 \n price: 1800000000',
 'meter: 2827586 \n area: 58 \n bathroom: 2 \n room: 2 \n property: 2 \n price: 164000000',
 'meter: 7382550 \n area: 149 \n bathroom: 4 \n room: 3 \n property: 2 \n price: 1100000000',
 'meter: 2833333 \n area: 300 \n bathroom: 3 \n room: 3 \n property: 1 \n price: 850000000',
 'meter: 3678474 \n area: 73 \n bathroom: 2 \n room: 3 \n property: 2 \n price: 270000000',
 'meter: 2254901 \n area: 51 \n bathroom: 2 \n room: 2 \n property: 2 \n price: 115000000',
 'meter: 2500000 \n area: 90 \n bathroom: 3 \n room: 3 \n property: 2 \n price: 225000000',
 'meter: 4508196 \n area: 122 \n bathroom: 5 \n room: 4 \n property: 2 \n price: 550000000',
 'meter: 3489583 \n area: 96 \n bathroom: 3 \n room: 3 \n property: 2 \n price: 335000000',
 'meter: 2151898 \n area: 395 \n bathroom: 5 \n room: 5 \n property: 1 \n price: 850000000',
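
A record in this format can be built from the raw listing fields. The sketch below is an illustration only: `format_record` is a hypothetical helper, and computing `meter` as `int(price / area)` is an assumption that agrees with most, but not all, of the sample rows above. Any special tokens wrapped around each record during training are not covered here.

```python
def format_record(area, bathroom, room, property_type, price):
    """Build one training string in the layout shown above (hypothetical helper)."""
    # Assumption: "meter" is the price per square meter, truncated to an integer.
    meter = int(price / area)
    return (
        f"meter: {meter} \n area: {area} \n bathroom: {bathroom} \n"
        f" room: {room} \n property: {property_type} \n price: {price}"
    )

# Reproduces the first sample row (property 1 = house, 2 = apartment).
record = format_record(area=267, bathroom=4, room=4, property_type=1, price=975000000)
print(repr(record))
```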

Hardware GPU

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   49C    P0    29W /  70W |      0MiB / 15360MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Required libraries

pip install transformers
pip install torch
pip install pandas

Python code to load the model and predict the value of a property (house/apartment)

import pandas as pd
import torch
import transformers
import re


'''
# If running in Google Colab
from google.colab import drive
drive.mount('/content/drive')
path="/content/drive/My Drive/DatosMetroCuadradoPrueba/"
'''

# Must end with "/" because the model folder name is appended directly below
path="[Path to the folder containing the MODEL]"

path_carga= path+"modeloEntrenadoPreciosCasasApartamentos"

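from transformers import GPT2Tokenizer, GPTNeoForCausalLM
# Use the GPU when one is available; otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
new_modelPredict = GPTNeoForCausalLM.from_pretrained(path_carga).to(device)
tokenizer2 = GPT2Tokenizer.from_pretrained(path_carga)
# Keep the embedding matrix the same size as the tokenizer vocabulary
new_modelPredict.resize_token_embeddings(len(tokenizer2))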

tipo_propiedad= 1 # 1: house, 2: apartment
habitaciones= 5
baños= 5
area= 580
valor_inmueble= 1500000000
valorMetroCuadrado= int(valor_inmueble/area)  # price per square meter

# Prompt in the training format, ending at "price:" so the model completes the value
propiedad = f"<|startoftext|>meter: {valorMetroCuadrado} \n area: {area} \n bathroom: {baños} \n room: {habitaciones} \n property: {tipo_propiedad} \n price:"
print("Text:",propiedad)

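# Tokenize the prompt and move it to the same device as the model
generated = tokenizer2(propiedad,
                      return_tensors="pt").input_ids.to(new_modelPredict.device)
# top_p is a cumulative probability and must lie in (0, 1]. The original value
# of 1.65 is out of range; 1.0 keeps nucleus filtering switched off.
sample_outputs = new_modelPredict.generate(generated,
                 do_sample=True,
                 top_k=50,
                 max_length=100,
                 num_beams=7,
                 top_p=1.0,
                 temperature=.69,
                 num_return_sequences=1,
                 pad_token_id = 0)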

price= []
# Take the number that follows "price: " on the last line of each generated text
for i, sample_output in enumerate(sample_outputs):
  text= tokenizer2.decode(sample_output, skip_special_tokens=True)
  try:
    num= text.split("\n")[-1].split("price: ")[1]
    num= re.sub(r'[^\d.]', '',num )  # keep only digits and the decimal point
    price.append( num )
  except IndexError:
    # The last line has no "price: " field; skip this sequence
    pass
# pd.set_option('display.float_format', '{:.2f}'.format)
priceData2= pd.DataFrame(price,columns=['price']).astype(int)
print(priceData2)
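
The parsing step above depends on the price being on the last line of the generated text. A more tolerant alternative is sketched below; `extract_price` is a hypothetical helper, not part of the original code, and it assumes the model writes the price as a plain run of digits after `price:`.

```python
import re

def extract_price(text):
    """Return the first integer that follows 'price:' in text, or None if absent."""
    match = re.search(r"price:\s*(\d+)", text)
    return int(match.group(1)) if match else None

# Example with a string in the same layout as the generated output.
sample = "meter: 2586206 \n area: 580 \n bathroom: 5 \n room: 5 \n property: 1 \n price: 1500000000"
print(extract_price(sample))
print(extract_price("no price field here"))
```

Returning `None` for texts without a price lets the caller filter out failed generations before building the DataFrame.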

