Added linear regression(+docs+tests) and fixed some bugs

Marco Cetica 2024-03-27 11:57:52 +01:00
parent 89b7825885
commit a8bbeece41
Signed by: marco
GPG Key ID: 45060A949E90D0FD
12 changed files with 273 additions and 12 deletions

View File

@ -24,6 +24,7 @@ Some of the supported features are:
- Basic arithmetical operations(`+`, `-`, `*`, `/`, `^`, `%`);
- Scientific notation support(`5e3` -> `5000`);
- Trigonometrical functions(`sin`, `cos`, `tan`, `asin`, `acos`, `atan`);
- Statistical functions(permutations, combinations, summation, sum of squares, mean, standard deviation, linear regression);
- Base conversion(binary: `pb`, octal: `po`, hexadecimal: `px`);
- Factorial and constants(`!`, `pi`, `e`);
- Random number generator(`@`);
@ -285,6 +286,14 @@ lV 10 M FF { 2 :A # Red
[ [ RGB( ] P 2 ;A P lc p. 1 ;A P lc p. 0 ;A P [ ) ] p. [ = ] p. lV ph ] x
```
19. Find the mean of the following temperatures(Celsius): `[25, 15, 9.5, 10, 20, 16, 20]`:
```
4 k
25 15 9.5 10 20 16 20
SX SX SX SX SX SX SX
gM p # Prints 16.5000
```
## License
[GPLv3](https://choosealicense.com/licenses/gpl-3.0/)

48
man.md

@ -3,7 +3,7 @@ title: dc
section: 1
header: General Commands Manual
footer: Marco Cetica
date: March 26, 2024
date: March 27, 2024
---
@ -225,9 +225,10 @@ Pops one value, computes its `acos`, and pushes that.
Pops one value, computes its `atan`, and pushes that.
## Statistics Operations
**dc** supports various common statistics operations, such as permutations, combinations, mean, standard deviation
summation, sum of squares and linear regression. All statistics functions are limited to **non-negative integers**.
Accumulating functions use the `X` register.
**dc** supports various common statistics operations, such as permutations, combinations, mean, standard deviation,
summation, sum of squares and linear regression. Most statistics functions are limited to **non-negative integers**.
The accumulating functions(such as the mean, the standard deviation and the linear regression) use the `X` and
the `Y` registers.
**gP**
@ -241,10 +242,6 @@ Pops two non-negative integers(that is, >=0) and computes `C_{y, x}`, that is th
`y` different items taken in quantities of `x` items at a time. No item shall occur more than once in a set and different orders of the same `x`
items are not counted separately. The `y` parameter corresponds to the second value popped, while the `x` parameter corresponds to the first one popped.
**gN**
Counts the number of accumulated elements of the `X` register's stack and pushes that.
**gs**
Computes Σx of the `X` register's stack and pushes that.
@ -261,6 +258,31 @@ Computes x̄(mean) of the `X` register's stack and pushes that.
Computes σ(standard deviation) of the `X` register's stack and pushes that.
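As an illustration of what this commit's `fn_sdev` change computes (it now divides by `count-1`, which yields the sample standard deviation), here is a minimal standalone sketch. The names `sample_sdev` and `check_one_to_nine` are hypothetical and are not part of **dc**; the guard for fewer than two values is an assumption of this sketch, not behaviour taken from the source.

```cpp
#include <cassert>
#include <cmath>
#include <optional>
#include <vector>

// Hypothetical helper, not part of dc: sample standard deviation,
// i.e. the square root of the sum of squared deviations divided by (n - 1).
// Returns std::nullopt for fewer than two values (assumption of this sketch).
std::optional<double> sample_sdev(const std::vector<double> &values) {
    if (values.size() < 2) {
        return std::nullopt;
    }

    const double n = static_cast<double>(values.size());

    // Mean of the accumulated values
    double sum = 0.0;
    for (const double v : values) {
        sum += v;
    }
    const double mean = sum / n;

    // Sum of squared deviations from the mean
    double sum_of_deviations = 0.0;
    for (const double v : values) {
        sum_of_deviations += (v - mean) * (v - mean);
    }

    return std::sqrt(sum_of_deviations / (n - 1.0));
}

// Usage sketch mirroring tests/test_rsdev: for 1..9 the squared deviations
// sum to 60, so the result is sqrt(60 / 8) = sqrt(7.5), about 2.7386.
void check_one_to_nine() {
    const auto s = sample_sdev({1, 2, 3, 4, 5, 6, 7, 8, 9});
    assert(s.has_value());
    assert(std::fabs(*s - std::sqrt(7.5)) < 1e-12);
}
```

Dividing by `n` instead of `n - 1` would give the population standard deviation, which is the value the code produced before this commit.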
**gL**
Computes linear regression of the `X` register's stack and the `Y` register's stack. Linear regression is a
simple statistical model to find a relationship between a *dependent variable*(`Y`) and an independent
variable(`X`). This function will compute the following linear equation:
$$
y = mx + b
$$
using the following formulae:
```
        ( n * ∑(x_i * y_i) ) - ( ∑x_i * ∑y_i )
    m = ---------------------------------------
              n * ∑(x_i^2) - (∑x_i)^2

                 ∑y_i - (m * ∑x_i)
    b = ---------------------------------------
                        n
```
Where **n** is the number of elements of each set.
The results - the _slope_ **m** and the _y-intercept_ **b** - will be pushed onto the stack in that order.
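The two formulae above can be sketched as a small standalone C++ function. This is only an illustration under stated assumptions: the names `linear_regression` and `check_identity_line` are hypothetical and not part of **dc**, the inputs are plain vectors rather than register stacks, and the zero-denominator guard (all `x` values equal) is an addition of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper, not part of dc: returns {slope, intercept} of the
// least-squares line through the points (xs[i], ys[i]). Returns std::nullopt
// when the sets are empty, differ in length, or when the denominator of the
// slope formula is zero (assumption of this sketch).
std::optional<std::pair<double, double>> linear_regression(
        const std::vector<double> &xs, const std::vector<double> &ys) {
    if (xs.empty() || xs.size() != ys.size()) {
        return std::nullopt;
    }

    const double n = static_cast<double>(xs.size());
    double sum_x = 0.0, sum_y = 0.0, sum_xy = 0.0, sum_xx = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        sum_x  += xs[i];
        sum_y  += ys[i];
        sum_xy += xs[i] * ys[i];  // sum of products
        sum_xx += xs[i] * xs[i];  // sum of squares
    }

    // n * sum(x^2) - (sum(x))^2
    const double denominator = n * sum_xx - sum_x * sum_x;
    if (denominator == 0.0) {
        return std::nullopt;
    }

    const double m = (n * sum_xy - sum_x * sum_y) / denominator;
    const double b = (sum_y - m * sum_x) / n;
    return std::make_pair(m, b);
}

// Usage sketch mirroring tests/test_rlr: with x = y = 1..9 every point lies
// on y = x, so the slope is 1 and the intercept is 0.
void check_identity_line() {
    const std::vector<double> v{1, 2, 3, 4, 5, 6, 7, 8, 9};
    const auto fit = linear_regression(v, v);
    assert(fit.has_value());
    assert(std::fabs(fit->first - 1.0) < 1e-12);
    assert(std::fabs(fit->second) < 1e-12);
}
```

Accumulating all four sums in a single pass keeps the sketch short; the arithmetic is the same as in the formulae for **m** and **b**.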
## Base Conversion
**pb**
@ -629,10 +651,18 @@ lV 10 M FF { 2 :A # Red
[ [ RGB( ] P 2 ;A P lc p. 1 ;A P lc p. 0 ;A P [ ) ] p. [ = ] p. lV ph ] x
```
15. Find the mean of the following temperatures(Celsius): `[25, 15, 9.5, 10, 20, 16, 20]`:
```
4 k
25 15 9.5 10 20 16 20
SX SX SX SX SX SX SX
gM p # Prints 16.5000
```
# AUTHORS
The original version of the **dc** command was written by Robert Morris and Lorinda Cherry.
This version of **dc** is developed by Marco Cetica.
# BUGS
If you encounter any kind of problem, email me at [email@marcocetica.com](mailto:email@marcocetica.com) or open an issue at [https://github.com/ice-bit/dc](https://github.com/ice-bit/dc).
If you encounter any kind of problem, email me at [email@marcocetica.com](mailto:email@marcocetica.com) or open an issue at [https://github.com/ceticamarco/dc](https://github.com/ceticamarco/dc).

View File

@ -52,6 +52,7 @@ void Evaluate::init_environment() {
this->op_factory.emplace("gS", MAKE_UNIQUE_PTR(Statistics, OPType::SUMXX));
this->op_factory.emplace("gM", MAKE_UNIQUE_PTR(Statistics, OPType::MEAN));
this->op_factory.emplace("gD", MAKE_UNIQUE_PTR(Statistics, OPType::SDEV));
this->op_factory.emplace("gL", MAKE_UNIQUE_PTR(Statistics, OPType::LREG));
// Bitwise operations
this->op_factory.emplace("{", MAKE_UNIQUE_PTR(Bitwise, OPType::BAND));
this->op_factory.emplace("}", MAKE_UNIQUE_PTR(Bitwise, OPType::BOR));

View File

@ -15,6 +15,7 @@ std::optional<std::string> Statistics::exec(dc::Stack<std::string> &stack, dc::P
case OPType::SUMXX: err = fn_sum_squared(stack, parameters, regs); break;
case OPType::MEAN: err = fn_mean(stack, parameters, regs); break;
case OPType::SDEV: err = fn_sdev(stack, parameters, regs); break;
case OPType::LREG: err = fn_lreg(stack, parameters, regs); break;
default: break;
}
@ -51,7 +52,7 @@ std::optional<std::string> Statistics::fn_perm(dc::Stack<std::string> &stack, co
unsigned long long permutation = numerator_opt.value() / denominator_opt.value();
stack.push(trim_digits(permutation, parameters.precision));
} else {
return "'gP' requires integers values";
return "'gP' requires integer values";
}
return std::nullopt;
@ -90,7 +91,7 @@ std::optional<std::string> Statistics::fn_comb(dc::Stack<std::string> &stack, co
stack.push(trim_digits(combination, parameters.precision));
} else {
return "'gC' requires integers values";
return "'gC' requires integer values";
}
return std::nullopt;
@ -176,7 +177,7 @@ std::optional<std::string> Statistics::fn_sdev(dc::Stack<std::string> &stack, co
return acc + std::pow(deviation, 2);
});
// Then compute the mean of previous values(variance)
auto variance = sum_of_deviations / count;
auto variance = sum_of_deviations / (count-1);
// Finally, compute the square root of the variance(standard deviation)
auto s_dev = sqrt(variance);
@ -185,6 +186,67 @@ std::optional<std::string> Statistics::fn_sdev(dc::Stack<std::string> &stack, co
return std::nullopt;
}
std::optional<std::string> Statistics::fn_lreg(dc::Stack<std::string> &stack, const dc::Parameters &parameters, std::unordered_map<char, dc::Register> &regs) {
// Check whether 'X' register exists
if(regs.find('X') == regs.end()) {
return "Register 'X' is undefined";
}
// Check if register's stack is empty
if(regs['X'].stack.empty()) {
return "The stack of register 'X' is empty";
}
// Check whether 'Y' register exists
if(regs.find('Y') == regs.end()) {
return "Register 'Y' is undefined";
}
// Check if register's stack is empty
if(regs['Y'].stack.empty()) {
return "The stack of register 'Y' is empty";
}
// Check that both registers have the same length
if(regs['X'].stack.size() != regs['Y'].stack.size()) {
return "'X' and 'Y' registers must be of the same length";
}
// Otherwise, retrieve count and summations of both sets
auto count = regs['X'].stack.size();
auto x_sum = regs['X'].stack.summation();
auto y_sum = regs['Y'].stack.summation();
// Then compute the sum of products
const auto& x_ref = regs['X'].stack.get_ref();
const auto& y_ref = regs['Y'].stack.get_ref();
std::size_t idx = 0;
double sum_of_products = 0.0;
for(auto it : x_ref) {
auto x = std::stod(it);
auto y = std::stod(y_ref[idx++]);
sum_of_products += (x * y);
}
// Then compute the sum of squares
auto x_sum_squares = regs['X'].stack.summation_squared();
// Then compute the slope of the line(m)
auto slope_numerator = ((count * sum_of_products) - (x_sum * y_sum));
auto slope_denominator = ((count * x_sum_squares) - std::pow(x_sum, 2));
auto slope = slope_numerator / slope_denominator;
// Then compute the intercept of the line(b)
auto intercept = (y_sum - (slope * x_sum)) / count;
// Finally, push the slope and the intercept(in this order) onto the main stack
stack.push(trim_digits(slope, parameters.precision));
stack.push(trim_digits(intercept, parameters.precision));
return std::nullopt;
}
std::optional<unsigned long long> Statistics::factorial(const long long n) {
if(n < 0) {
return std::nullopt;

View File

@ -14,6 +14,7 @@ private:
std::optional<std::string> fn_sum_squared(dc::Stack<std::string> &stack, const dc::Parameters &parameters, std::unordered_map<char, dc::Register> &regs);
std::optional<std::string> fn_mean(dc::Stack<std::string> &stack, const dc::Parameters &parameters, std::unordered_map<char, dc::Register> &regs);
std::optional<std::string> fn_sdev(dc::Stack<std::string> &stack, const dc::Parameters &parameters, std::unordered_map<char, dc::Register> &regs);
std::optional<std::string> fn_lreg(dc::Stack<std::string> &stack, const dc::Parameters &parameters, std::unordered_map<char, dc::Register> &regs);
std::optional<unsigned long long> factorial(const long long n);
std::string trim_digits(double number, unsigned int precision);

20
tests/test_comb Normal file

@ -0,0 +1,20 @@
#!/bin/sh
utest() {
PROGRAM="$PWD/build/dc"
EXPECTED="276"
ACTUAL=$("$PROGRAM" -e '24 2 gC p')
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="'gC' requires two operands"
ACTUAL=$("$PROGRAM" -e 'gC' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test non numerical values
EXPECTED="'gC' requires integer values"
ACTUAL=$("$PROGRAM" -e '[ foo ] 5 gC' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
}
# vim: ts=4 sw=4 softtabstop=4 expandtab:

20
tests/test_perm Normal file

@ -0,0 +1,20 @@
#!/bin/sh
utest() {
PROGRAM="$PWD/build/dc"
EXPECTED="2520"
ACTUAL=$("$PROGRAM" -e '7 5 gP p')
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="'gP' requires two operands"
ACTUAL=$("$PROGRAM" -e 'gP' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test non numerical values
EXPECTED="'gP' requires integer values"
ACTUAL=$("$PROGRAM" -e '[ foo ] 5 gP' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
}
# vim: ts=4 sw=4 softtabstop=4 expandtab:

20
tests/test_ravg Normal file

@ -0,0 +1,20 @@
#!/bin/sh
utest() {
PROGRAM="$PWD/build/dc"
EXPECTED="5"
ACTUAL=$("$PROGRAM" -e '1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gM p cX')
assert_eq "$EXPECTED" "$ACTUAL"
# Test undefined register
EXPECTED="Register 'X' is undefined"
ACTUAL=$("$PROGRAM" -e 'gM' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="The stack of register 'X' is empty"
ACTUAL=$("$PROGRAM" -e '0 5 :X gM' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
}
# vim: ts=4 sw=4 softtabstop=4 expandtab:

38
tests/test_rlr Normal file

@ -0,0 +1,38 @@
#!/bin/sh
utest() {
PROGRAM="$PWD/build/dc"
# Test undefined register
EXPECTED="Register 'X' is undefined"
ACTUAL=$("$PROGRAM" -e 'gL' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="The stack of register 'X' is empty"
ACTUAL=$("$PROGRAM" -e '0 5 :X gL' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test undefined register
EXPECTED="Register 'Y' is undefined"
ACTUAL=$("$PROGRAM" -e '5 SX gL' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="The stack of register 'Y' is empty"
ACTUAL=$("$PROGRAM" -e '5 SX 0 5 :Y gL' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test stacks of different sizes
EXPECTED="'X' and 'Y' registers must be of the same length"
ACTUAL=$("$PROGRAM" -e '5 SX 5 SY 7 SY gL' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test with real values
EXPECTED="1 0"
ACTUAL=$("$PROGRAM" -e "cX cY \
1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX \
1 2 3 4 5 6 7 8 9 SY SY SY SY SY SY SY SY SY \
gL r p. r p")
assert_eq "$EXPECTED" "$ACTUAL"
}
# vim: ts=4 sw=4 softtabstop=4 expandtab:

20
tests/test_rsdev Normal file

@ -0,0 +1,20 @@
#!/bin/sh
utest() {
PROGRAM="$PWD/build/dc"
EXPECTED="2.7386"
ACTUAL=$("$PROGRAM" -e '4 k 1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gD p cX')
assert_eq "$EXPECTED" "$ACTUAL"
# Test undefined register
EXPECTED="Register 'X' is undefined"
ACTUAL=$("$PROGRAM" -e 'gD' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="The stack of register 'X' is empty"
ACTUAL=$("$PROGRAM" -e '0 5 :X gD' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
}
# vim: ts=4 sw=4 softtabstop=4 expandtab:

20
tests/test_rsum Normal file

@ -0,0 +1,20 @@
#!/bin/sh
utest() {
PROGRAM="$PWD/build/dc"
EXPECTED="45"
ACTUAL=$("$PROGRAM" -e '1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gs p cX')
assert_eq "$EXPECTED" "$ACTUAL"
# Test undefined register
EXPECTED="Register 'X' is undefined"
ACTUAL=$("$PROGRAM" -e 'gs' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="The stack of register 'X' is empty"
ACTUAL=$("$PROGRAM" -e '0 5 :X gs' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
}
# vim: ts=4 sw=4 softtabstop=4 expandtab:

20
tests/test_rsumx Normal file

@ -0,0 +1,20 @@
#!/bin/sh
utest() {
PROGRAM="$PWD/build/dc"
EXPECTED="285"
ACTUAL=$("$PROGRAM" -e '1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gS p cX')
assert_eq "$EXPECTED" "$ACTUAL"
# Test undefined register
EXPECTED="Register 'X' is undefined"
ACTUAL=$("$PROGRAM" -e 'gS' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
# Test empty stack
EXPECTED="The stack of register 'X' is empty"
ACTUAL=$("$PROGRAM" -e '0 5 :X gS' 2>&1) || true
assert_eq "$EXPECTED" "$ACTUAL"
}
# vim: ts=4 sw=4 softtabstop=4 expandtab: