v.shape # (x, y, z, examples)
v_flattened = v.reshape(x * y * z, examples)
v_flattened # (x * y * z, examples)
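# Flatten everything except axis 0.
# The "-1" makes reshape flatten the remaining dimensions.
# (This is covered in the numpy.reshape docs: one shape dimension can be -1,
# and its value is inferred from the array length and the other dimensions.)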
ndarray.reshape(ndarray.shape[0], -1)
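A small check of both reshape forms above. The concrete sizes (4, 3, 2, 5) are made up purely for illustration; only the shapes matter here.

```python
import numpy as np

# Hypothetical sizes, chosen only so the shapes are easy to follow.
x, y, z, examples = 4, 3, 2, 5
v = np.arange(x * y * z * examples).reshape(x, y, z, examples)

# Collapse the first three axes and keep the examples axis.
v_flattened = v.reshape(x * y * z, examples)

# Keep axis 0 and let -1 infer the product of the remaining axes (3 * 2 * 5).
w = v.reshape(v.shape[0], -1)

flat_shape = v_flattened.shape
keep0_shape = w.shape
print(flat_shape, keep0_shape)
```

Note that the two calls are not interchangeable: the first keeps the last axis, the second keeps the first.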
Axes are defined for arrays with more than one dimension. A 2-dimensional array has two corresponding axes: the first running vertically downwards across rows (axis 0), and the second running horizontally across columns (axis 1).
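To make the axis numbering concrete, here is a minimal sketch using `sum`; the array `m` is just an illustrative example.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

# axis=0 runs down the rows, so summing along it collapses the rows
# and leaves one value per column.
col_sums = m.sum(axis=0)

# axis=1 runs across the columns, so summing along it collapses the columns
# and leaves one value per row.
row_sums = m.sum(axis=1)

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
```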
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions and works its way forward. Two dimensions are compatible when
1. they are equal, or
2. one of them is 1
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
a + 1
# [[2, 3, 4]
#  [5, 6, 7]]

# b has shape (2, 1); its single column is stretched across a's 3 columns.
b = np.array([[100],
              [100]])
a + b
# [[101, 102, 103]
#  [104, 105, 106]]
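The compatibility rule can also be checked on shapes alone, without building any arrays. This sketch assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; the shapes are illustrative.

```python
import numpy as np

# Shapes are compared from the trailing dimension backwards.
ok_1 = np.broadcast_shapes((2, 3), (2, 1))  # 3 vs 1 and 2 vs 2 are compatible
ok_2 = np.broadcast_shapes((2, 3), (3,))    # 3 vs 3; the missing leading dim is fine

try:
    # 3 vs 2: neither equal nor 1, so broadcasting is rejected.
    np.broadcast_shapes((2, 3), (2,))
    compatible = True
except ValueError:
    compatible = False

print(ok_1, ok_2, compatible)
```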
vectorization
For matrix computations, use numpy.dot and similar functions instead of explicit for-loops.
import numpy as np
a = np.random.rand(100000)
b = np.random.rand(100000)
c = np.dot(a, b)
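# Fast. Note that the dot product is still O(N) in the vector length, just like
# a for loop; the gain is a large constant factor, not O(1).
# NumPy's vectorized routines run in compiled code and use SIMD instructions
# internally (when available), which is why they are fast.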
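A rough way to compare the two approaches on your own machine. `dot_loop` is a hypothetical helper written only for this comparison, and the timings depend on hardware and the NumPy build, so no numbers are claimed here.

```python
import time
import numpy as np

def dot_loop(a, b):
    """Explicit Python loop version of the dot product (for comparison only)."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

rng = np.random.default_rng(0)
a = rng.random(100_000)
b = rng.random(100_000)

t0 = time.perf_counter()
loop_result = dot_loop(a, b)
t1 = time.perf_counter()
dot_result = np.dot(a, b)
t2 = time.perf_counter()

print(f"for loop: {(t1 - t0) * 1e3:.2f} ms")
print(f"np.dot:   {(t2 - t1) * 1e3:.2f} ms")
```

Both versions compute the same value up to floating-point rounding; only the running time differs.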
Aho and Corasick [AC75] presented a linear-time algorithm for this problem, based on an automata
approach. This algorithm serves as the basis for the UNIX tool fgrep. A linear-time algorithm is optimal
in the worst case, but as the regular string-searching algorithm by Boyer and Moore [BM77] demonstrated,
it is possible to actually skip a large portion of the text while searching, leading to faster than linear algorithms in the average case
https://www.researchgate.net/publication/2566530_A_Fast_Algorithm_For_Multi-Pattern_Searching
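To illustrate what "skipping a large portion of the text" means, here is a minimal sketch of the Horspool simplification of Boyer-Moore for a single pattern. It is not the full Boyer-Moore algorithm and not the multi-pattern algorithm from the paper; `horspool_find` is a hypothetical name used only here.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Both arguments are bytes. Uses the Horspool bad-character shift only.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For each byte in the pattern (except the last position), record how far
    # the window may move when that byte is aligned with the window's end.
    shift = {byte: m - 1 - i for i, byte in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Bytes that never occur in the pattern allow a full jump of m.
        pos += shift.get(text[pos + m - 1], m)
    return -1

found = horspool_find(b"here is a simple example", b"example")
missing = horspool_find(b"here is a simple example", b"sample")
print(found, missing)
```

Because the window can jump by up to the pattern length, many text bytes are never inspected, which is where the better-than-linear average case comes from.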
We do still have a few tricks up our sleeve though. For example, many Aho-Corasick implementations are built as-if they were tries with back-pointers for their failure transitions. We can actually do better than that. We can compile all of its failure transitions into a DFA with a transition table contiguous in memory. This means that every byte of input corresponds to a single lookup in the transition table to find the next state. We never have to waste time chasing pointers or walking more than one failure transition for any byte in the search text.
https://blog.burntsushi.net/ripgrep/
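A minimal sketch of the idea in the quote: build the Aho-Corasick trie, then fold the failure transitions into one dense transition table so that searching does exactly one lookup per input byte. This is an illustration in Python under my own assumptions, not ripgrep's actual implementation; `build_dfa` and `find_all` are hypothetical names, and a flat Python list stands in for the contiguous table.

```python
from collections import deque

def build_dfa(patterns, alphabet_size=256):
    """Build a dense Aho-Corasick DFA over bytes.

    Returns (table, outputs), where table[state * alphabet_size + byte] is the
    next state and outputs[state] lists the patterns ending at that state.
    """
    # Step 1: build the trie. goto[s] maps byte -> child state.
    goto = [{}]
    outputs = [[]]
    for pat in patterns:
        state = 0
        for byte in pat:
            if byte not in goto[state]:
                goto.append({})
                outputs.append([])
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        outputs[state].append(pat)

    table = [0] * (len(goto) * alphabet_size)
    fail = [0] * len(goto)

    # Step 2: root row. Bytes with no child stay at the root (state 0).
    queue = deque()
    for byte, child in goto[0].items():
        table[byte] = child
        queue.append(child)

    # Step 3: breadth-first, so a state's failure state (always shallower)
    # already has a complete row when we reach the state itself.
    while queue:
        state = queue.popleft()
        f = fail[state]
        outputs[state] = outputs[state] + outputs[f]
        base, fbase = state * alphabet_size, f * alphabet_size
        # Start from the failure state's row, then overwrite real trie edges.
        table[base:base + alphabet_size] = table[fbase:fbase + alphabet_size]
        for byte, child in goto[state].items():
            fail[child] = table[fbase + byte]
            table[base + byte] = child
            queue.append(child)
    return table, outputs

def find_all(table, outputs, text, alphabet_size=256):
    """Scan text (bytes) with one table lookup per byte; return (start, pattern) pairs."""
    state = 0
    matches = []
    for i, byte in enumerate(text):
        state = table[state * alphabet_size + byte]
        for pat in outputs[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches

table, outputs = build_dfa([b"he", b"she", b"his", b"hers"])
matches = find_all(table, outputs, b"ushers")
print(matches)
```

The trade-off is memory: the table has one entry per (state, byte) pair, so it grows quickly with the number of patterns, in exchange for never following a failure pointer during the search.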