An Experiment with Applying Attention to a PyTorch Regression Model on a Synthetic Dataset

The goal of a machine learning regression problem is to predict a single numeric value. Classical ML regression techniques include linear regression, Gaussian process regression, gradient boosting regression, and others.

I’ve been experimenting with a new algorithm for machine learning regression. I call the technique attention regression because it applies neural attention (used in natural language processing) to a standard deep neural regression system.

Before I started the experiment described in this post, I decided to update my existing PyTorch version 2.2.1+cpu to 2.3.1+cpu. What could go wrong?

I went to download.pytorch.org/whl/torch_stable.html and downloaded file torch-2.3.1+cpu-cp311-cp311-win_amd64.whl (because I’m using Python 3.11.5 in Anaconda3-2023.09-0). But then sheesh, it took me a couple of hours to install. Apparently version 2.3 has a whole slew of new dependencies. In the end I was forced to use the somewhat risky command pip install --ignore-installed "torch-2.3.1+cpu-cp311-cp311-win_amd64.whl". If you’re new to PyTorch, be aware that dealing with hundreds of library dependencies can be a nightmare.

For the experiment described in this post, I used a set of synthetic data that looks like:

-0.1660, 0.4406, -0.9998, -0.3953, -0.7065, -0.8153, 0.7022
-0.2065, 0.0776, -0.1616,  0.3704, -0.5911,  0.7562, 0.5666
-0.9452, 0.3409, -0.1654,  0.1174, -0.7192, -0.6038, 0.8186
 0.7528, 0.7892, -0.8299, -0.9219, -0.6603,  0.7563, 0.3687
. . .

The first six values on each line are the predictors. The last value on each line is the target y value to predict. The data was generated by a 6-10-1 neural network with random weights and biases. There are 200 training items and 40 test items.
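
To make the data description concrete, here's a minimal plain-Python sketch of how a 6-10-1 network with tanh hidden activation and sigmoid output (the design used by the generator program listed at the end of this post) maps six predictors to a target between 0 and 1. The weights below are random stand-ins and the helper names (make_weights, forward, parse_line) are made up for illustration; they are not the values or code that produced the actual data files.

```python
import math
import random

def make_weights(n_in, n_hid, lim=0.80, seed=1):
    # random stand-in weights and biases in [-lim, +lim]
    rnd = random.Random(seed)
    u = lambda: rnd.uniform(-lim, lim)
    ih = [[u() for _ in range(n_in)] for _ in range(n_hid)]
    hb = [u() for _ in range(n_hid)]
    ho = [u() for _ in range(n_hid)]
    ob = u()
    return ih, hb, ho, ob

def forward(x, wts):
    # 6-10-1: tanh hidden layer, sigmoid output in (0, 1)
    ih, hb, ho, ob = wts
    hid = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
           for row, b in zip(ih, hb)]
    z = sum(w * h for w, h in zip(ho, hid)) + ob
    return 1.0 / (1.0 + math.exp(-z))

def parse_line(line):
    # first six values are predictors, last value is the target
    vals = [float(v) for v in line.split(",")]
    return vals[:6], vals[6]

wts = make_weights(6, 10)
x, y = parse_line(
    "-0.1660, 0.4406, -0.9998, -0.3953, -0.7065, -0.8153, 0.7022")
print(x, y)
print(forward(x, wts))  # some value strictly between 0 and 1
```

Because the output node uses sigmoid, every target value lands between 0 and 1, which matches the last column of the data shown above.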



This diagram shows a 6–12-PE-A-(6-6)-1 attention regression system, not the 6–12-PE-A-(10-10)-1 architecture of the demo program.


I put together a PyTorch attention regression model with architecture 6–12-PE-A-(10-10)-1. There are 6 input/predictor values. Each input value is mapped to a numeric pseudo-embedding of 2 values, giving 12 input values. Those 12 input values are augmented with a simplified form of positional encoding (as opposed to the complex positional encoding used in NLP). The encoded values go to a simplified numeric vector Attention layer (as opposed to a complex natural language processing multi-headed Attention layer). The output of the Attention layer goes to 2 fully-connected Dense layers with 10 nodes each that reduce the output to a single value.
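
Here's a rough plain-Python sketch of the first two stages, the 6-to-12 pseudo-embedding and the simplified positional encoding. The fixed weights are hypothetical values chosen only so the arithmetic is easy to follow (the demo learns its weights), and the function names are illustrative rather than taken from the demo program.

```python
def pseudo_embed(x, weights, biases):
    # x: list of n_in inputs; weights[i], biases[i]: lists of n values
    # each input x[i] becomes n values: w * x[i] + b
    out = []
    for xi, ws, bs in zip(x, weights, biases):
        out.extend(w * xi + b for w, b in zip(ws, bs))
    return out

def add_position(v):
    # simplified positional encoding: position j adds j * (0.01 / n)
    n = len(v)
    return [vj + j * (0.01 / n) for j, vj in enumerate(v)]

x = [-0.1660, 0.4406, -0.9998, -0.3953, -0.7065, -0.8153]

# illustrative fixed weights (hypothetical); the demo learns these
weights = [[0.5, -0.5] for _ in x]
biases = [[0.0, 0.0] for _ in x]

emb = pseudo_embed(x, weights, biases)
enc = add_position(emb)
print(len(emb))   # 12
print(emb[:2])    # the two values derived from the first input
print(enc[:2])    # same values after the small position offsets
```

The position offsets are tiny (at most about 0.01), so they nudge each of the 12 values by an amount that depends only on its position, with no sine or cosine terms.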

The key parts of the demo output are:

Begin attention regression on synthetic data

Loading train (200) and test (40) data to memory
Done

First row of train data:
tensor([-0.1660,  0.4406, -0.9998, -0.3953, -0.7065, -0.8153])

First target y value:
tensor([0.7022])

Creating 6--12-PE-A-(10-10)-1 regression model

bat_size = 10
loss = MSELoss()
optimizer = Adam
lrn_rate = 0.001

Starting training
epoch =    0  |  loss = 0.7178
epoch =   20  |  loss = 0.0586
epoch =   40  |  loss = 0.0366
epoch =   60  |  loss = 0.0449
epoch =   80  |  loss = 0.0356
Done

Computing model accuracy (within 0.15 of true)
Accuracy on train data = 0.8550
Accuracy on test data = 0.8750

Predicting target y for train[0]:
(true y = 0.7022)

Predicted y = 0.7258

End demo

I implemented an accuracy() function that scores a predicted y value as correct if it’s within 15 percent of the true target y value (the 0.15 in the demo output is that fraction, not an absolute difference). The model’s accuracy result of 85.50% on the training data (171 out of 200 correct) and 87.50% on the test data (35 out of 40 correct) is reasonably good for this dataset. I ran the data through a scikit-learn library GradientBoostingRegressor model with 25 decision tree learners and remaining default parameters, and it scored 86.00% and 60.00% accuracy on the training and test data.
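
The accuracy() function in the program listing compares the absolute error against pct_close times the true value, so the tolerance scales with the size of the target. A minimal stand-alone sketch of that rule, using plain floats instead of tensors (the is_close helper name is made up for illustration):

```python
def is_close(pred, actual, pct_close):
    # correct if the prediction is within pct_close of the true value
    return abs(pred - actual) < abs(pct_close * actual)

def accuracy(preds, actuals, pct_close):
    n_correct = sum(is_close(p, a, pct_close)
                    for p, a in zip(preds, actuals))
    return n_correct / len(actuals)

# the prediction shown in the demo output
print(is_close(0.7258, 0.7022, 0.15))   # True: 0.0236 < 0.1053
# the same absolute error on a small target is scored wrong
print(is_close(0.1236, 0.1000, 0.15))   # False: 0.0236 > 0.0150
print(171 / 200, 35 / 40)               # 0.855 0.875
```

One consequence of a relative tolerance is that targets near 0 are much harder to score as correct than targets near 1.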

So, at this point in my investigation, I’m reasonably confident that attention regression works about as well as existing regression algorithms. But the whole point of the attention mechanism is to deal with data that has an inherent ordering of the predictor values, such as words in a sentence. To explore that idea, I’m going to need some new (probably synthetic) data.
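
In the demo's VectorAttention.forward(), x has shape (batch, 12), so q times k-transposed has shape (batch, batch) and Softmax(dim=0) normalizes each column of that matrix. A rough plain-Python trace of those shapes is below. The learned Q, K, V and O layers are replaced by identity mappings purely to keep the sketch short; that is an assumption for illustration, not what the demo does.

```python
import math

def matmul(a, b):
    # rows of a times columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax_dim0(m):
    # normalize each column so it sums to 1, like Softmax(dim=0)
    cols = []
    for col in zip(*m):
        mx = max(col)
        ex = [math.exp(v - mx) for v in col]
        s = sum(ex)
        cols.append([e / s for e in ex])
    return [list(row) for row in zip(*cols)]

def attention(x):
    # identity stand-ins for the learned Q, K, V, O layers (assumption)
    q, k, v = x, x, x
    nf = len(x[0])
    kt = [list(col) for col in zip(*k)]
    scores = matmul(q, kt)                     # (batch, batch)
    scores = [[s / math.sqrt(nf) for s in row] for row in scores]
    attn = softmax_dim0(scores)
    return attn, matmul(attn, v)               # (batch, nf)

batch = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
attn, z = attention(batch)
print(len(attn), len(attn[0]))   # 2 2
print(len(z), len(z[0]))         # 2 3

attn1, z1 = attention([[0.1, 0.2, 0.3]])
print(attn1)                     # [[1.0]]
```

With a single input row the score matrix is 1x1 and its softmax is 1.0, so the attention output for that row is just v. That may be worth keeping in mind when designing the order-sensitive data mentioned above.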



My attention regression demo is primitive, but it demonstrates that the concept may be useful.

The USS Holland (SS-1) was the first submarine of the U.S. Navy. The Holland was designed by John Holland and commissioned in October 1900. It is considered the first modern submarine and featured innovations including a conning tower with view ports, a reloadable torpedo tube, and dual gas and electric engines. The only modern feature missing was a periscope, which was introduced by Holland competitor Simon Lake on his submarine Protector in 1902, but Protector was not accepted by the Navy. Although primitive, the USS Holland clearly demonstrated that submarines could be useful weapons of war. I built a plastic model of the Holland when I was a young man and I dreamt about submarines for many years.


Demo program. If you see “lt” (less than), “gt”, “lte”, or “gte” in the code, replace them with the corresponding comparison operator symbols. (My lame blog editor consistently mangles symbols).

# synthetic_attention.py
# regression with attention on a synthetic dataset 
# PyTorch 2.3.1-CPU  Anaconda3-2023.09  Python 3.11.5
# Windows 10/11 

import numpy as np
import torch as T  # non-standard alias

device = T.device('cpu')  # apply to Tensor or Module

# -----------------------------------------------------------

class SynthDataset(T.utils.data.Dataset):
  def __init__(self, src_file):
    tmp_x = np.loadtxt(src_file, delimiter=",",
      usecols=[0,1,2,3,4,5], dtype=np.float32)
    tmp_y = np.loadtxt(src_file, usecols=6, delimiter=",",
      dtype=np.float32)
    tmp_y = tmp_y.reshape(-1,1)  # 2D required

    self.x_data = T.tensor(tmp_x, dtype=T.float32).to(device)
    self.y_data = T.tensor(tmp_y, dtype=T.float32).to(device)

  def __len__(self):
    return len(self.x_data)

  def __getitem__(self, idx):
    preds = self.x_data[idx]
    trgts = self.y_data[idx] 
    return (preds, trgts)  # as a tuple

# -----------------------------------------------------------

class SkipLinear(T.nn.Module):
  # numeric pseudo-embedding
  # -----

  class Core(T.nn.Module):
    def __init__(self, n):
      super().__init__()
      # 1 node to n nodes, n >= 2
      self.weights = T.nn.Parameter(T.zeros((n,1),
        dtype=T.float32))
      # one bias per node: T.zeros(n), not T.tensor(n) (a scalar)
      self.biases = T.nn.Parameter(T.zeros(n,
        dtype=T.float32))
      lim = 0.01
      T.nn.init.uniform_(self.weights, -lim, lim)
      T.nn.init.zeros_(self.biases)

    def forward(self, x):
      wx= T.mm(x, self.weights.t())
      v = T.add(wx, self.biases)
      return v

  # -----

  def __init__(self, n_in, n_out):
    super().__init__()
    self.n_in = n_in; self.n_out = n_out
    if n_out % n_in != 0:
      raise ValueError("n_out must be divisible by n_in")
    n = n_out // n_in  # num nodes per input

    self.lst_modules = \
      T.nn.ModuleList([SkipLinear.Core(n) for \
        i in range(n_in)])

  def forward(self, x):
    lst_nodes = []
    for i in range(self.n_in):
      xi = x[:,i].reshape(-1,1)
      oupt = self.lst_modules[i](xi)
      lst_nodes.append(oupt)
    result = T.cat(lst_nodes, dim=1)  # (batch, n_out)
    result = result.reshape(-1, self.n_out)
    return result

# -----------------------------------------------------------

class PositionEncode(T.nn.Module):
  def __init__(self, n_features):
    super(PositionEncode, self).__init__()  # old syntax
    self.nf = n_features
    self.pe = T.zeros(n_features, dtype=T.float32)
    for i in range(n_features):
      self.pe[i] = i * (0.01 / n_features)  # no sin, cos

  def forward(self, x):
    # add pe[j] to column j of every row (broadcast over batch)
    return x + self.pe

# -----------------------------------------------------------

class VectorAttention(T.nn.Module):
  def __init__(self, n_features):
    super(VectorAttention, self).__init__()
    self.nf = n_features
    self.Q = T.nn.Linear(n_features, n_features)
    self.K = T.nn.Linear(n_features, n_features)
    self.V = T.nn.Linear(n_features, n_features)
    self.O = T.nn.Linear(n_features, n_features)
    self.soft = T.nn.Softmax(dim=0)  # scale columns (?!)
   
  def forward(self, x): # x is (batch, seq)
    q = self.Q(x)
    k = self.K(x)
    v = self.V(x)

    scores = T.matmul(q, k.transpose(0,1))
    scores = scores / np.sqrt(self.nf)
    attn = self.soft(scores)
    z = T.matmul(attn, v)
    return self.O(z)  

# -----------------------------------------------------------

class AttentionNet(T.nn.Module):
  def __init__(self):
    super(AttentionNet, self).__init__()
    self.embed = SkipLinear(6, 12)  # 6 inputs, each to 2
    self.pos_enc = PositionEncode(12)
    self.att = VectorAttention(12)  # features/ dim
    self.hid1 = T.nn.Linear(12, 10)  # 12-(10-10)-1
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 1)

    # T.nn.init.xavier_uniform_(self.hid1.weight)  # glorot
    # T.nn.init.zeros_(self.hid1.bias)
    # T.nn.init.xavier_uniform_(self.hid2.weight)
    # T.nn.init.zeros_(self.hid2.bias)
    # T.nn.init.xavier_uniform_(self.oupt.weight)
    # T.nn.init.zeros_(self.oupt.bias)
    # use default initialization

  def forward(self, x):
    z = self.embed(x)
    z = self.pos_enc(z)
    z = self.att(z)  # 2D
    z = T.tanh(self.hid1(z))
    z = T.tanh(self.hid2(z))
    z = self.oupt(z)  # regression: no activation
    return z

# -----------------------------------------------------------

def train(model, ds, bs, lr, me, le):
  # dataset, bat_size, lrn_rate, max_epochs, log interval
  train_ldr = T.utils.data.DataLoader(ds, batch_size=bs,
    shuffle=True)
  loss_func = T.nn.MSELoss()
  optimizer = T.optim.Adam(model.parameters(), lr=lr)
  # optimizer = T.optim.SGD(model.parameters(), lr=lr)

  for epoch in range(0, me):
    epoch_loss = 0.0  # for one full epoch
    for (b_idx, batch) in enumerate(train_ldr):
      X = batch[0]  # predictors
      y = batch[1]  # target y values
      optimizer.zero_grad()
      oupt = model(X)
      loss_val = loss_func(oupt, y)  # a tensor
      epoch_loss += loss_val.item()  # accumulate
      loss_val.backward()  # compute gradients
      optimizer.step()     # update weights

    if epoch % le == 0:
      print("epoch = %4d  |  loss = %0.4f" % \
        (epoch, epoch_loss)) 

# -----------------------------------------------------------

def accuracy(model, ds, pct_close):
  # assumes model.eval()
  # correct if within pct_close of the true target y
  n_correct = 0; n_wrong = 0

  for i in range(len(ds)):
    X = ds[i][0].reshape(1,-1)  # [1,6] 2D
    # print(X.shape); input()
    Y = ds[i][1]   # 1D, shape [1]
    with T.no_grad():
      oupt = model(X) 

    # print("predicted = "); print(oupt)
    # print("actual = "); print(Y)

    if T.abs(oupt - Y) < T.abs(pct_close * Y):
      n_correct += 1; # print("correct")
    else:
      n_wrong += 1; # print("wrong")

  acc = (n_correct * 1.0) / (n_correct + n_wrong)
  return acc

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nBegin attention regression on synthetic data ")
  np.random.seed(0)
  T.manual_seed(0) 

  # 1. load data
  print("\nLoading train (200) and test (40) data to memory ")
  train_file = ".\\Data\\synthetic_train.txt"
  train_ds = SynthDataset(train_file)  # 200 rows
  test_file = ".\\Data\\synthetic_test.txt"
  test_ds = SynthDataset(test_file)    # 40 rows
  print("Done ")

  print("\nFirst row of train data: ")
  print(train_ds[0][0])
  print("\nFirst target y value: ")
  print(train_ds[0][1])

  # 2. create model
  print("\nCreating 6--12-PE-A-(10-10)-1 regression model ")
  net = AttentionNet().to(device)

  # 3. train model
  print("\nbat_size = 10 ")
  print("loss = MSELoss() ")
  print("optimizer = Adam ")
  print("lrn_rate = 0.001 ")

  print("\nStarting training")
  net.train()
  train(net, train_ds, bs=10, lr=0.001, me=100, le=20)
  print("Done ")

# -----------------------------------------------------------

  # 4. evaluate model accuracy
  net.eval()
  print("\nComputing model accuracy (within 0.15 of true) ")
  acc_train = accuracy(net, train_ds, 0.15)  # item-by-item
  print("Accuracy on train data = %0.4f" % acc_train)

  acc_test = accuracy(net, test_ds, 0.15) 
  print("Accuracy on test data = %0.4f" % acc_test)

# -----------------------------------------------------------

  # 5. make a prediction
  print("\nPredicting target y for train[0]: ")
  print("(true y = 0.7022) ")
  x = np.array([[-0.1660, 0.4406, -0.9998, -0.3953,
    -0.7065, -0.8153]], dtype=np.float32)  # 0.7022
  x = T.tensor(x, dtype=T.float32).to(device) 

  with T.no_grad():
    y = net(x)
  pred_raw = y.item()  # scalar
  print("\nPredicted y = %0.4f" % pred_raw)  

# -----------------------------------------------------------

  # 6. TODO: save model (state_dict approach)

  print("\nEnd demo ")

if __name__=="__main__":
  main()

Training data:

-0.1660, 0.4406,-0.9998,-0.3953,-0.7065,-0.8153, 0.7022
-0.2065, 0.0776,-0.1616, 0.3704,-0.5911, 0.7562, 0.5666
-0.9452, 0.3409,-0.1654, 0.1174,-0.7192,-0.6038, 0.8186
 0.7528, 0.7892,-0.8299,-0.9219,-0.6603, 0.7563, 0.3687
-0.8033,-0.1578, 0.9158, 0.0663, 0.3838,-0.3690, 0.7535
-0.9634, 0.5003, 0.9777, 0.4963,-0.4391, 0.5786, 0.7076
-0.7935,-0.1042, 0.8172,-0.4128,-0.4244,-0.7399, 0.8454
-0.0169,-0.8933, 0.1482,-0.7065, 0.1786, 0.3995, 0.7302
-0.7953,-0.1719, 0.3888,-0.1716,-0.9001, 0.0718, 0.8692
 0.8892, 0.1731, 0.8068,-0.7251,-0.7214, 0.6148, 0.4740
-0.2046,-0.6693, 0.8550,-0.3045, 0.5016, 0.4520, 0.6714
 0.5019,-0.3022,-0.4601, 0.7918,-0.1438, 0.9297, 0.4331
 0.3269, 0.2434,-0.7705, 0.8990,-0.1002, 0.1568, 0.3716
 0.8068, 0.1474,-0.9943, 0.2343,-0.3467, 0.0541, 0.3829
 0.7719,-0.2855, 0.8171, 0.2467,-0.9684, 0.8589, 0.4700
 0.8652, 0.3936,-0.8680, 0.5109, 0.5078, 0.8460, 0.2648
 0.4230,-0.7515,-0.9602,-0.9476,-0.9434,-0.5076, 0.8059
 0.1056, 0.6841,-0.7517,-0.4416, 0.1715, 0.9392, 0.3512
 0.1221,-0.9627, 0.6013,-0.5341, 0.6142,-0.2243, 0.6840
 0.1125,-0.7271,-0.8802,-0.7573,-0.9109,-0.7850, 0.8640
-0.5486, 0.4260, 0.1194,-0.9749,-0.8561, 0.9346, 0.6109
-0.4953, 0.4877,-0.6091, 0.1627, 0.9400, 0.6937, 0.3382
-0.5203,-0.0125, 0.2399, 0.6580,-0.6864,-0.9628, 0.7400
 0.2127, 0.1377,-0.3653, 0.9772, 0.1595,-0.2397, 0.4081
 0.1019, 0.4907, 0.3385,-0.4702,-0.8673,-0.2598, 0.6582
 0.5055,-0.8669,-0.4794, 0.6095,-0.6131, 0.2789, 0.6644
 0.0493, 0.8496,-0.4734,-0.8681, 0.4701, 0.5444, 0.3214
 0.9004, 0.1133, 0.8312, 0.2831,-0.2200,-0.0280, 0.3149
 0.2086, 0.0991, 0.8524, 0.8375,-0.2102, 0.9265, 0.3619
-0.7298, 0.0113,-0.9570, 0.8959, 0.6542,-0.9700, 0.6451
-0.6476,-0.3359,-0.7380, 0.6190,-0.3105, 0.8802, 0.6606
 0.6895, 0.8108,-0.0802, 0.0927, 0.5972,-0.4286, 0.2427
-0.0195, 0.1982,-0.9689, 0.1870,-0.1326, 0.6147, 0.4773
 0.1557,-0.6320, 0.5759, 0.2241,-0.8922,-0.1596, 0.7581
 0.3581, 0.8372,-0.9992, 0.9535,-0.2468, 0.9476, 0.2962
 0.1494, 0.2562,-0.4288, 0.1737, 0.5000, 0.7166, 0.3513
 0.5102, 0.3961, 0.7290,-0.3546, 0.3416,-0.0983, 0.3153
-0.1970,-0.3652, 0.2438,-0.1395, 0.9476, 0.3556, 0.4719
-0.6029,-0.1466,-0.3133, 0.5953, 0.7600, 0.8077, 0.3875
-0.4953, 0.7098, 0.0554, 0.6043, 0.1450, 0.4663, 0.4739
 0.0380, 0.5418, 0.1377,-0.0686,-0.3146,-0.8636, 0.6048
 0.9656,-0.6368, 0.6237, 0.7499, 0.3768, 0.1390, 0.3705
-0.6781,-0.0662,-0.3097,-0.5499, 0.1850,-0.3755, 0.7668
-0.6141,-0.0008, 0.4572,-0.5836,-0.5039, 0.7033, 0.7301
-0.1683, 0.2334,-0.5327,-0.7961, 0.0317,-0.0457, 0.5777
 0.0880, 0.3083,-0.7109, 0.5031,-0.5559, 0.0387, 0.5118
 0.5706,-0.9553,-0.3513, 0.7458, 0.6894, 0.0769, 0.4329
-0.8025, 0.3026, 0.4070, 0.2205, 0.5992,-0.9309, 0.7098
 0.5405, 0.4635,-0.4806,-0.4859, 0.2646,-0.3094, 0.3566
 0.5655, 0.9809,-0.3995,-0.7140, 0.8026, 0.0831, 0.2551
 0.9495, 0.2732, 0.9878, 0.0921, 0.0529,-0.7291, 0.3074
-0.6792, 0.4913,-0.9392,-0.2669, 0.7247, 0.3854, 0.4362
 0.3819,-0.6227,-0.1162, 0.1632, 0.9795,-0.5922, 0.4435
 0.5003,-0.0860,-0.8861, 0.0170,-0.5761, 0.5972, 0.5136
-0.4053,-0.9448, 0.1869, 0.6877,-0.2380, 0.4997, 0.7859
 0.9189, 0.6079,-0.9354, 0.4188,-0.0700, 0.8951, 0.2696
-0.5571,-0.4659,-0.8371,-0.1428,-0.7820, 0.2676, 0.8566
 0.5324,-0.3151, 0.6917,-0.1425, 0.6480, 0.2530, 0.4252
-0.7132,-0.8432,-0.9633,-0.8666,-0.0828,-0.7733, 0.9217
-0.0952,-0.0998,-0.0439,-0.0520, 0.6063,-0.1952, 0.5140
 0.8094,-0.9259, 0.5477,-0.7487, 0.2370,-0.9793, 0.5562
 0.9024, 0.8108, 0.5919, 0.8305,-0.7089,-0.6845, 0.2993
-0.6247, 0.2450, 0.8116, 0.9799, 0.4222, 0.4636, 0.4619
-0.5003,-0.6531,-0.7611, 0.6252,-0.7064,-0.4714, 0.8452
 0.6382,-0.3788, 0.9648,-0.4667, 0.0673,-0.3711, 0.5070
-0.1328, 0.0246, 0.8778,-0.9381, 0.4338, 0.7820, 0.5680
-0.9454, 0.0441,-0.3480, 0.7190, 0.1170, 0.3805, 0.6562
-0.4198,-0.9813, 0.1535,-0.3771, 0.0345, 0.8328, 0.7707
-0.1471,-0.5052,-0.2574, 0.8637, 0.8737, 0.6887, 0.3436
-0.3712,-0.6505, 0.2142,-0.1728, 0.6327,-0.6297, 0.7430
 0.4038,-0.5193, 0.1484,-0.3020,-0.8861,-0.5424, 0.7499
 0.0380,-0.6506, 0.1414, 0.9935, 0.6337, 0.1887, 0.4509
 0.9520, 0.8031, 0.1912,-0.9351,-0.8128,-0.8693, 0.5336
 0.9507,-0.6640, 0.9456, 0.5349, 0.6485, 0.2652, 0.3616
 0.3375,-0.0462,-0.9737,-0.2940,-0.0159, 0.4602, 0.4840
-0.7247,-0.9782, 0.5166,-0.3601, 0.9688,-0.5595, 0.7751
-0.3226, 0.0478, 0.5098,-0.0723,-0.7504,-0.3750, 0.8025
 0.5403,-0.7393,-0.9542, 0.0382, 0.6200,-0.9748, 0.5359
 0.3449, 0.3736,-0.1015, 0.8296, 0.2887,-0.9895, 0.4390
 0.6608, 0.2983, 0.3474, 0.1570,-0.4518, 0.1211, 0.3624
 0.3435,-0.2951, 0.7117,-0.6099, 0.4946,-0.4208, 0.5283
 0.6154,-0.2929,-0.5726, 0.5346,-0.3827, 0.4665, 0.4907
 0.4889,-0.5572,-0.5718,-0.6021,-0.7150,-0.2458, 0.7202
-0.8389,-0.5366,-0.5847, 0.8347, 0.4226, 0.1078, 0.6391
-0.3910, 0.6697,-0.1294, 0.8469, 0.4121,-0.0439, 0.4693
-0.1376,-0.1916,-0.7065, 0.4586,-0.6225, 0.2878, 0.6695
 0.5086,-0.5785, 0.2019, 0.4979, 0.2764, 0.1943, 0.4666
 0.8906,-0.1489, 0.5644,-0.8877, 0.6705,-0.6155, 0.3480
-0.2098,-0.3998,-0.8398, 0.8093,-0.2597, 0.0614, 0.6341
-0.5871,-0.8476, 0.0158,-0.4769,-0.2859,-0.7839, 0.9006
 0.5751,-0.7868, 0.9714,-0.6457, 0.1448,-0.9103, 0.6049
 0.0558, 0.4802,-0.7001, 0.1022,-0.5668, 0.5184, 0.4612
 0.4458,-0.6469, 0.7239,-0.9604, 0.7205, 0.1178, 0.5941
 0.4339, 0.9747,-0.4438,-0.9924, 0.8678, 0.7158, 0.2627
 0.4577, 0.0334, 0.4139, 0.5611,-0.2502, 0.5406, 0.3847
-0.1963, 0.3946,-0.9938, 0.5498, 0.7928,-0.5214, 0.5025
-0.7585,-0.5594,-0.3958, 0.7661, 0.0863,-0.4266, 0.7481
 0.2277,-0.3517,-0.0853,-0.1118, 0.6563,-0.1473, 0.4798
-0.3086, 0.3499,-0.5570,-0.0655,-0.3705, 0.2537, 0.5768
 0.5689,-0.0861, 0.3125,-0.7363,-0.1340, 0.8186, 0.5035
 0.2110, 0.5335, 0.0094,-0.0039, 0.6858,-0.8644, 0.4243
 0.0357,-0.6111, 0.6959,-0.4967, 0.4015, 0.0805, 0.6611
 0.8977, 0.2487, 0.6760,-0.9841, 0.9787,-0.8446, 0.2873
-0.9821, 0.6455, 0.7224,-0.1203,-0.4885, 0.6054, 0.6908
-0.0443,-0.7313, 0.8557, 0.7919,-0.0169, 0.7134, 0.6039
-0.2040, 0.0115,-0.6209, 0.9300,-0.4116,-0.7931, 0.6495
-0.7114,-0.9718, 0.4319, 0.1290, 0.5892, 0.0142, 0.7675
 0.5557,-0.1870, 0.2955,-0.6404,-0.3564,-0.6548, 0.6295
-0.1827,-0.5172,-0.1862, 0.9504,-0.3594, 0.9650, 0.5685
 0.7150, 0.2392,-0.4959, 0.5857,-0.1341,-0.2850, 0.3585
-0.3394, 0.3947,-0.4627, 0.6166,-0.4094, 0.0882, 0.5962
 0.7768,-0.6312, 0.1707, 0.7964,-0.1078, 0.8437, 0.4243
-0.4420, 0.2177, 0.3649,-0.5436,-0.9725,-0.1666, 0.8086
 0.5595,-0.6505,-0.3161,-0.7108, 0.4335, 0.3986, 0.5846
 0.3770,-0.4932, 0.3847,-0.5454,-0.1507,-0.2562, 0.6335
 0.2633, 0.4146, 0.2272, 0.2966,-0.6601,-0.7011, 0.5653
 0.0284, 0.7507,-0.6321,-0.0743,-0.1421,-0.0054, 0.4219
-0.4762, 0.6891, 0.6007,-0.1467, 0.2140,-0.7091, 0.6098
 0.0192,-0.4061, 0.7193, 0.3432, 0.2669,-0.7505, 0.6549
 0.8966, 0.2902,-0.6966, 0.2783, 0.1313,-0.0627, 0.2876
-0.1439, 0.1985, 0.6999, 0.5022, 0.1587, 0.8494, 0.3872
 0.2473,-0.9040,-0.4308,-0.8779, 0.4070, 0.3369, 0.6825
-0.2428,-0.6236, 0.4940,-0.3192, 0.5906,-0.0242, 0.6770
 0.2885,-0.2987,-0.5416,-0.1322,-0.2351,-0.0604, 0.6106
 0.9590,-0.2712, 0.5488, 0.1055, 0.7783,-0.2901, 0.2956
-0.9129, 0.9015, 0.1128,-0.2473, 0.9901,-0.8833, 0.6500
 0.0334,-0.9378, 0.1424,-0.6391, 0.2619, 0.9618, 0.7033
 0.4169, 0.5549,-0.0103, 0.0571,-0.6984,-0.2612, 0.4935
-0.7156, 0.4538,-0.0460,-0.1022, 0.7720, 0.0552, 0.4983
-0.8560,-0.1637,-0.9485,-0.4177, 0.0070, 0.9319, 0.6445
-0.7812, 0.3461,-0.0001, 0.5542,-0.7128,-0.8336, 0.7720
-0.6166, 0.5356,-0.4194,-0.5662,-0.9666,-0.2027, 0.7401
-0.2378, 0.3187,-0.8582,-0.6948,-0.9668,-0.7724, 0.7670
-0.3579, 0.1158, 0.9869, 0.6690, 0.3992, 0.8365, 0.4184
-0.9205,-0.8593,-0.0520,-0.3017, 0.8745,-0.0209, 0.7723
-0.1067, 0.7541,-0.4928,-0.4524,-0.3433, 0.0951, 0.4645
-0.5597, 0.3429,-0.7144,-0.8118, 0.7404,-0.5263, 0.6117
 0.0516,-0.8480, 0.7483, 0.9023, 0.6250,-0.4324, 0.5987
 0.0557,-0.3212, 0.1093, 0.9488,-0.3766, 0.3376, 0.5739
-0.3484, 0.7797, 0.5034, 0.5253,-0.0610,-0.5785, 0.5365
-0.9170,-0.3563,-0.9258, 0.3877, 0.3407,-0.1391, 0.7131
-0.9203,-0.7304,-0.6132,-0.3287,-0.8954, 0.2102, 0.9329
 0.0241, 0.2349,-0.1353, 0.6954,-0.0919,-0.9692, 0.5744
 0.6460, 0.9036,-0.8982,-0.5299,-0.8733,-0.1567, 0.4425
 0.7277,-0.8368,-0.0538,-0.7489, 0.5458, 0.6828, 0.5848
-0.5212, 0.9049, 0.8878, 0.2279, 0.9470,-0.3103, 0.4255
 0.7957,-0.1308,-0.5284, 0.8817, 0.3684,-0.8702, 0.3969
 0.2099, 0.4647,-0.4931, 0.2010, 0.6292,-0.8918, 0.4620
-0.7390, 0.6849, 0.2367, 0.0626,-0.5034,-0.4098, 0.7137
-0.8711, 0.7940,-0.5932, 0.6525, 0.7635,-0.0265, 0.5705
 0.1969, 0.0545, 0.2496, 0.7101,-0.4357, 0.7675, 0.4242
-0.5460, 0.1920,-0.5211,-0.7372,-0.6763, 0.6897, 0.6769
 0.2044, 0.9271,-0.3086, 0.1913, 0.1980, 0.2314, 0.2998
-0.6149, 0.5059,-0.9854,-0.3435, 0.8352, 0.1767, 0.4497
 0.7104, 0.2093, 0.6452, 0.7590,-0.3580,-0.7541, 0.4076
-0.7465, 0.1796,-0.9279,-0.5996, 0.5766,-0.9758, 0.7713
-0.3933,-0.9572, 0.9950, 0.1641,-0.4132, 0.8579, 0.7421
 0.1757,-0.4717,-0.3894,-0.2567,-0.5111, 0.1691, 0.7088
 0.3917,-0.8561, 0.9422, 0.5061, 0.6123, 0.5033, 0.4824
-0.1087, 0.3449,-0.1025, 0.4086, 0.3633, 0.3943, 0.3760
 0.2372,-0.6980, 0.5216, 0.5621, 0.8082,-0.5325, 0.5297
-0.3589, 0.6310, 0.2271, 0.5200,-0.1447,-0.8011, 0.5903
-0.7699,-0.2532,-0.6123, 0.6415, 0.1993, 0.3777, 0.6039
-0.5298,-0.0768,-0.6028,-0.9490, 0.4588, 0.4498, 0.6159
-0.3392, 0.6870,-0.1431, 0.7294, 0.3141, 0.1621, 0.4501
 0.7889,-0.3900, 0.7419, 0.8175,-0.3403, 0.3661, 0.4087
 0.7984,-0.8486, 0.7572,-0.6183, 0.6995, 0.3342, 0.5025
 0.2707, 0.6956, 0.6437, 0.2565, 0.9126, 0.1798, 0.2331
-0.6043,-0.1413,-0.3265, 0.9839,-0.2395, 0.9854, 0.5444
-0.8509,-0.2594,-0.7532, 0.2690,-0.1722, 0.9818, 0.6516
 0.8599,-0.7015,-0.2102,-0.0768, 0.1219, 0.5607, 0.4747
-0.4760, 0.8216,-0.9555, 0.6422,-0.6231, 0.3715, 0.5485
-0.2896, 0.9484,-0.7545,-0.6249, 0.7789, 0.1668, 0.3415
-0.5931, 0.7926, 0.7462, 0.4006,-0.0590, 0.6543, 0.4781
-0.0083,-0.2730,-0.4488, 0.8495,-0.2260,-0.0142, 0.5854
-0.2335,-0.4049, 0.4352,-0.6183,-0.7636, 0.6740, 0.7596
 0.4883, 0.1810,-0.5142, 0.2465, 0.2767,-0.3449, 0.3995
-0.4922, 0.1828,-0.1424,-0.2358,-0.7466,-0.5115, 0.7968
-0.8413,-0.3943, 0.4834, 0.2300, 0.3448,-0.9832, 0.7989
-0.5382,-0.6502,-0.6300, 0.6885, 0.9652, 0.8275, 0.4353
-0.3053, 0.5604, 0.0929, 0.6329,-0.0325, 0.1799, 0.4848
 0.0740,-0.2680, 0.2086, 0.9176,-0.2144,-0.2141, 0.5856
 0.5813, 0.2902,-0.2122, 0.3779,-0.1920,-0.7278, 0.4079
-0.5641, 0.8515, 0.3793, 0.1976, 0.4933, 0.0839, 0.4716
 0.4011, 0.8611, 0.7252,-0.6651,-0.4737,-0.8568, 0.5708
-0.5785, 0.0056,-0.7901,-0.2223, 0.0760,-0.3216, 0.7252
 0.1118, 0.0735,-0.2188, 0.3925, 0.3570, 0.3746, 0.3688
 0.2262, 0.8715, 0.1938, 0.9592,-0.1180, 0.4792, 0.2952
-0.9248, 0.5295, 0.0366,-0.9894,-0.4456, 0.0697, 0.7335
 0.2992, 0.8629,-0.8505,-0.4464, 0.8385, 0.5300, 0.2702
 0.1995, 0.6659, 0.7921, 0.9454, 0.9970,-0.7207, 0.2996
-0.3066,-0.2927,-0.4923, 0.8220, 0.4513,-0.9481, 0.6617
-0.0770,-0.4374,-0.9421, 0.7694, 0.5420,-0.3405, 0.5131
-0.3842, 0.8562, 0.9538, 0.0471, 0.9039, 0.7760, 0.3215
 0.0361,-0.2545, 0.4207,-0.0887, 0.2104, 0.9808, 0.5202
-0.8220,-0.6302, 0.0537,-0.1658, 0.6013, 0.8664, 0.6598
-0.6443, 0.7201, 0.9148, 0.9189,-0.9243,-0.8848, 0.6095
-0.2880, 0.9074,-0.0461,-0.4435, 0.0060, 0.2867, 0.4025
-0.7775, 0.5161, 0.7039, 0.6885, 0.7810,-0.2363, 0.5234
-0.5484, 0.9426,-0.4308, 0.8148, 0.7811, 0.8450, 0.3479

Test data:

-0.6877, 0.7594, 0.2640,-0.5787,-0.3098,-0.6802, 0.7071
-0.6694,-0.6056, 0.3821, 0.1476, 0.7466,-0.5107, 0.7282
 0.2592,-0.9311, 0.0324, 0.7265, 0.9683,-0.9803, 0.5832
-0.9049,-0.9797,-0.0196,-0.9090,-0.4433, 0.2799, 0.9018
-0.4106,-0.4607, 0.1811,-0.2389, 0.4050,-0.0078, 0.6916
-0.4259,-0.7336, 0.8742, 0.6097, 0.8761,-0.6292, 0.6728
 0.8663, 0.8715,-0.4329,-0.4507, 0.1029,-0.6294, 0.2936
 0.8948,-0.0124, 0.9278, 0.2899,-0.0314, 0.9354, 0.3160
-0.7136, 0.2647, 0.3238,-0.1323,-0.8813,-0.0146, 0.8133
-0.4867,-0.2171,-0.5197, 0.3729, 0.9798,-0.6451, 0.5820
 0.6429,-0.5380,-0.8840,-0.7224, 0.8703, 0.7771, 0.5777
 0.6999,-0.1307,-0.0639, 0.2597,-0.6839,-0.9704, 0.5796
-0.4690,-0.9691, 0.3490, 0.1029,-0.3567, 0.5604, 0.8151
-0.4154,-0.6081,-0.8241, 0.7400,-0.8236, 0.3674, 0.7881
-0.7592,-0.9786, 0.1145, 0.8142, 0.7209,-0.3231, 0.6968
 0.3393, 0.6156, 0.7950,-0.0923, 0.1157, 0.0123, 0.3229
 0.3840, 0.3658, 0.0406, 0.6569, 0.0116, 0.6497, 0.2879
 0.9397, 0.4839,-0.4804, 0.1625, 0.9105,-0.8385, 0.2410
-0.8329, 0.2383,-0.5510, 0.5304, 0.1363, 0.3324, 0.5862
-0.8255,-0.2579, 0.3443,-0.6208, 0.7915, 0.8997, 0.6109
 0.9231, 0.4602,-0.1874, 0.4875,-0.4240,-0.3712, 0.3165
 0.7573,-0.4908, 0.5324, 0.8820,-0.9979,-0.0478, 0.6093
 0.3141, 0.6866,-0.6325, 0.7123,-0.2713, 0.7845, 0.3050
-0.1647,-0.6616, 0.2998,-0.9260,-0.3768,-0.3530, 0.8315
 0.2149, 0.3017, 0.6921, 0.8552, 0.3209, 0.1563, 0.3157
-0.6918, 0.7902,-0.3780, 0.0970, 0.3641,-0.5271, 0.6323
-0.6645, 0.0170, 0.5837, 0.3848,-0.7621, 0.8015, 0.7440
 0.1069,-0.8304,-0.5951, 0.7085, 0.4119, 0.7899, 0.4998
-0.3417, 0.0560, 0.3008, 0.1886,-0.5371,-0.1464, 0.7339
 0.9734,-0.8669, 0.4279,-0.3398, 0.2509,-0.4837, 0.4665
 0.3020,-0.2577,-0.4104, 0.8235, 0.8850, 0.2271, 0.3066
-0.5766, 0.6603,-0.5198, 0.2632, 0.4215, 0.4848, 0.4478
-0.2195, 0.5197, 0.8059, 0.1748,-0.8192,-0.7420, 0.6740
-0.9212,-0.5169, 0.7581, 0.9470, 0.2108, 0.9525, 0.6180
-0.9131, 0.8971,-0.3774, 0.5979, 0.6213, 0.7200, 0.4642
-0.4842, 0.8689, 0.2382, 0.9709,-0.9347, 0.4503, 0.5662
 0.1311,-0.0152,-0.4816,-0.3463,-0.5011,-0.5615, 0.6979
-0.8336, 0.5540, 0.0673, 0.4788, 0.0308,-0.2001, 0.6917
 0.9725,-0.9435, 0.8655, 0.8617,-0.2182,-0.5711, 0.6021
 0.6064,-0.4921,-0.4184, 0.8318, 0.8058, 0.0708, 0.3221

Program that generated the synthetic data:

# make_synthetic.py
# create synthetic train and test datasets for regression
# PyTorch 2.0.1-CPU  Anaconda3-2022.10  Python 3.9.13
# Windows 10/11 

import numpy as np
import torch as T  # non-standard alias
device = T.device('cpu')  # apply to Tensor or Module

# -----------------------------------------------------------

class Net(T.nn.Module):
  def __init__(self, n_in):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(n_in, 10)  # n-(10)-1
    self.oupt = T.nn.Linear(10, 1)
  
    lim = 0.80
    T.nn.init.uniform_(self.hid1.weight, -lim, lim) 
    T.nn.init.uniform_(self.hid1.bias, -lim, lim)
    T.nn.init.uniform_(self.oupt.weight, -lim, lim) 
    T.nn.init.uniform_(self.oupt.bias, -lim, lim)

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = T.sigmoid(self.oupt(z))  # oupt in [0.0, 1.0]
    return z

# -----------------------------------------------------------

def create_data_file(net, n_in, fn, n_items):
  f = open(fn, "w")
  x_lo = -1.0; x_hi = 1.0
  for i in range(n_items):
    s = ""
    X = (x_hi - x_lo) * np.random.random(size=(1,n_in)) + x_lo
    for j in range(n_in):
      s += ("%7.4f" % X[0][j]) + ","
    X = T.tensor(X, dtype=T.float32)

    with T.no_grad():
      y = net(X).item()
    # add noise
    y += np.random.normal(loc=0.0, scale=0.01)
    # make sure y is in range
    if y < 0.0: y = 0.01 * np.random.random() + 0.01 # pos
    s += ("%7.4f" % y) + "\n"

    f.write(s)
  f.close()

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nCreating synthetic datasets for regression ")
  np.random.seed(1)
  T.manual_seed(1) 

  # 1. create neural generator model
  n_in = 6
  print("\nCreating n-(10)-1 regression model ")
  net = Net(n_in).to(device)

  # 2. use model to create data
  print("\nCreating data files ")
  create_data_file(net, n_in, ".\\synthetic_train.txt", 200)
  create_data_file(net, n_in, ".\\synthetic_test.txt", 40)

  print("\nEnd create synthetic data ")

# -----------------------------------------------------------

if __name__=="__main__":
  main()

The scikit GradientBoostingRegressor program:

# wind_scikit.py

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def accuracy(model, data_X, data_y, pct_close):
  n_correct = 0; n_wrong = 0

  for i in range(len(data_X)):
    x = data_X[i].reshape(1, -1)
    y = data_y[i]  # actual
    y_pred = model.predict(x)

    if np.abs(y_pred - y) < np.abs(pct_close * y):
      n_correct += 1
    else:
      n_wrong += 1
  acc = (n_correct * 1.0) / (n_correct + n_wrong)
  return acc

# -----------------------------------------------------------

train_file = ".\\Data\\synthetic_train.txt"
test_file = ".\\Data\\synthetic_test.txt"

train_XY = np.loadtxt(train_file,
 usecols=[0,1,2,3,4,5,6],
 delimiter=",", comments="#", dtype=np.float32)

train_X = train_XY[:,0:6]
train_y = train_XY[:,6]

test_XY = np.loadtxt(test_file,
 usecols=[0,1,2,3,4,5,6],
 delimiter=",", comments="#", dtype=np.float32)

test_X = test_XY[:,0:6]
test_y = test_XY[:,6]

gbr = GradientBoostingRegressor(n_estimators=25, random_state=1)
gbr.fit(train_X, train_y)

acc_train = accuracy(gbr, train_X, train_y, 0.15)
print("\nAccuracy on train = %0.4f " % acc_train)

acc_test = accuracy(gbr, test_X, test_y, 0.15)
print("Accuracy on test = %0.4f " % acc_test)