Purpose: The accurate prediction of intrafraction lung tumor motion is required to compensate for system latency in image-guided adaptive radiotherapy systems. The goal of this study was to identify an optimal prediction model that has a short learning period so that prediction and adaptation can commence soon after treatment begins, and requires minimal reoptimization for individual patients. Specifically, the feasibility of predicting tumor position using a combination of a generalized (i.e., averaged) neural network, optimized using historical patient data (i.e., tumor trajectories) obtained offline, coupled with the use of real-time online tumor positions (obtained during treatment delivery) was examined. Methods: A 3-layer perceptron neural network was implemented to predict tumor motion for a prediction horizon of 650 ms. A backpropagation algorithm and batch gradient descent approach were used to train the model. Twenty-seven 1-min lung tumor motion samples (selected from a CyberKnife patient dataset) were sampled at a rate of 7.5 Hz (0.133 s) to emulate the frame rate of an electronic portal imaging device (EPID). A sliding temporal window was used to sample the data for learning. The sliding window length was set to be equivalent to the first breathing cycle detected from each trajectory. Performing a parametric sweep, an averaged error surface of mean square errors (MSE) was obtained from the prediction responses of seven trajectories used for the training of the model (Group 1). An optimal input data size and number of hidden neurons were selected to represent the generalized model. To evaluate the prediction performance of the generalized model on unseen data, twenty tumor traces (Group 2) that were not involved in the training of the model were used for the leave-one-out cross-validation purposes. Results: An input data size of 35 samples (4.6 s) and 20 hidden neurons were selected for the generalized neural network. An average sliding window length of 28 data samples was used. The average initial learning period prior to the availability of the first predicted tumor position was 8.53 ± 1.03 s. Average mean absolute error (MAE) of 0.59 ± 0.13 mm and 0.56 ± 0.18 mm were obtained from Groups 1 and 2, respectively, giving an overall MAE of 0.57 ± 0.17 mm. Average root-mean-square-error (RMSE) of 0.67 ± 0.36 for all the traces (0.76 ± 0.34 mm, Group 1 and 0.63 ± 0.36 mm, Group 2), is comparable to previously published results. Prediction errors are mainly due to the irregular periodicities between cycles. Since the errors from Groups 1 and 2 are within the same range, it demonstrates that this model can generalize and predict on unseen data. Conclusions: This is a first attempt to use an averaged MSE error surface (obtained from the prediction of different patients’ tumor trajectories) to determine the parameters of a generalized neural network. This network could be deployed as a plug-and-play predictor for tumor trajectory during treatment delivery, eliminating the need for optimizing individual networks with pretreatment patient data. © 2017 American Association of Physicists in Medicine.